  • armstrongWebb

Extending OpenAI's GPT-4 Capability to Query Multiple and Long Documents

Updated: Jun 19, 2023



In November 2022, OpenAI's ChatGPT arrived like a bolt out of the blue. It notched up the fastest adoption rate of any Internet tool, gaining over a million users in a handful of days.


ChatGPT, the underlying GPT-3.5 model and the substantially more powerful GPT-4 are three of a growing range of so-called Large Language Models (LLMs). As they evolve, they are becoming increasingly effective at reasoning from text, for example by summarising it in a given writing style ("In the style of a management consultant, summarise...").


Even though GPT-4 has the most effective reasoning capability of the three models, it has a significant constraint if you want to use it on material beyond its own training: the size of the context you can supply as its source.


Put simply, the default GPT-4 has a limit of roughly 8,000 tokens, shared between the question and the response it generates. A token is a short fragment of text; as a rule of thumb, one token is about four characters of English. For example, if you want to provide it with text that it may not be aware of, and then ask it to summarise that text, the text, the request and the summary all need to fit within the 8,000-token limit (strictly speaking, there is also a version of GPT-4 with a 32,000-token limit).
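
The budget described above can be sketched as simple arithmetic. This is a rough illustration only: the function names are hypothetical, and chars_per_token is an assumed ratio (about four characters per token is a common rule of thumb for English text), so adjust it as needed. A real tokenizer would give exact counts.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; chars_per_token is an assumed ratio, not an exact count."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


def fits_in_context(document: str, request: str,
                    reserved_for_response: int, limit: int = 8000) -> bool:
    """Check that document + request + room for the reply stay within the limit."""
    used = estimate_tokens(document) + estimate_tokens(request)
    return used + reserved_for_response <= limit


short_report = "word " * 800     # about 4,000 characters
long_report = "word " * 8000     # about 40,000 characters
request = "Summarise this report in the style of a management consultant."

print(fits_in_context(short_report, request, reserved_for_response=500))
print(fits_in_context(long_report, request, reserved_for_response=500))
```

Under these assumptions the short report fits and the long one does not, which is the situation the rest of this article addresses.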


Improvements

Even though GPT-4 has only been available for approximately a month (at the time of writing, May 2023), other organisations are developing alternative LLMs. For example, a variant of the newly released Mosaic MPT-7B has a 65,000-token limit, which is definitely an improvement in terms of context size.


But there is an alternative approach that significantly reduces the impact of the standard 8,000-token limit, even across many hundreds of pages of text and beyond. If you would rather skip the overview of the technology underpinning it, jump straight to this video, where you'll see an example of AI Enhanced Query scanning 290+ pages split across two documents.










How is it done? Enter the Vector Database

Vector databases are designed to do at least two things well.


Firstly, they store a mathematical representation of text efficiently. Each piece of text is stored as a vector, such that vectors that are 'close together' in vector space have a similar textual meaning. Secondly, the vectors are indexed in such a way that they can be retrieved efficiently, and that means quickly.
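
To make 'close together' concrete, here is a minimal sketch using cosine similarity, one common way of measuring closeness between vectors. The three-dimensional vectors are hand-made stand-ins for illustration; real embeddings come from a model and typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical vectors: 'invoice' and 'bill' are given similar values on purpose.
invoice = [0.9, 0.1, 0.0]
bill = [0.8, 0.2, 0.1]
giraffe = [0.0, 0.1, 0.9]

print(cosine_similarity(invoice, bill))     # close to 1: similar meaning
print(cosine_similarity(invoice, giraffe))  # close to 0: unrelated
```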





Hence the first task is to convert the collection of relevant PDF documents into a vector database. Unless one or more of these documents changes, this is a one-off exercise.
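
The conversion step might look something like the sketch below. It is illustrative rather than a description of the actual implementation: it assumes the text has already been extracted from each PDF, it uses a simple word count as a stand-in for a real embedding model, and it uses a plain Python list in place of a real vector database. All function names are hypothetical.

```python
import re
from collections import Counter


def split_into_chunks(text, chunk_size=50, overlap=10):
    """Split text into overlapping chunks of roughly chunk_size words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def toy_embed(text):
    """Stand-in for an embedding model: a sparse vector of word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents):
    """documents maps a file name to its already-extracted text."""
    index = []
    for name, text in documents.items():
        for position, chunk in enumerate(split_into_chunks(text)):
            index.append({
                "source": name,       # kept so answers can cite where they came from
                "chunk": position,
                "text": chunk,
                "vector": toy_embed(chunk),
            })
    return index


index = build_index({
    "handbook.pdf": "Staff must complete fire safety training every twelve months.",
    "warranty.pdf": "The warranty period for the pump is two years from delivery.",
})
print(len(index), index[0]["source"])
```

Overlapping the chunks is a design choice that reduces the chance of a relevant sentence being cut in half at a chunk boundary.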


Querying the Vector Database

When a query is submitted, the vector database is searched for items of text that are relevant to the query. The returned text and the associated query are then passed to GPT-4, which processes them and returns a response that can then be viewed.
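
A minimal sketch of this retrieve-then-ask flow follows. It makes the same kind of assumptions as any toy example: word counts stand in for real embeddings, a Python list stands in for the vector database, and the prompt wording is invented for illustration. The final call to GPT-4 is deliberately left out; the sketch stops at the prompt that would be sent.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in for an embedding model: a sparse vector of word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query, best first."""
    query_vec = toy_embed(query)
    scored = [(cosine_similarity(query_vec, toy_embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def build_prompt(query, context_chunks):
    """Combine the retrieved text and the query into one prompt for the LLM."""
    context = "\n\n".join(context_chunks)
    return ("Answer the question using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")


chunks = [
    "The warranty period for the pump is two years from delivery.",
    "Staff must complete fire safety training every twelve months.",
    "Warranty claims for the pump require proof of purchase.",
]
query = "What is the warranty period for the pump?"
top = retrieve(query, chunks, top_k=2)
print(build_prompt(query, top))
```

Because only the most relevant chunks are sent, the prompt stays small however many pages the original documents contain, which is how this approach works around the token limit.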


A PDF of the query, response and Similarity Sources can be requested, as shown here:












