Now GPT-4 (already tuned to accept questions as prompts) has been partly released to subscribers and is capturing people's imaginations again. Where ChatGPT (based on GPT-3.5) reaches its limits, GPT-4 still seems to produce compelling results:
Enhanced recognition of user intent and logical reasoning
In general, the evaluation data published by OpenAI shows that GPT-4 exhibits vastly enhanced detection of user intent as well as logical reasoning.
Not only did it pass the Uniform Bar Exam (a standardised US bar exam whose results are transferable among multiple US states) among the top 10 % of all test takers (as opposed to GPT-3.5 scoring among the lowest 10 %), it also achieves surprisingly good results in tests on economics and physics (scoring among the top 20 %) as well as in various math tests. The latter are particularly astonishing, as these tests do not require linguistic skills but mostly mathematical logic applied to novel problems.
Similar signs of enhanced logical reasoning skills were shown in connection with drug discovery. GPT-4 was shown to be able to come up with the composition of a drug presumed to circumvent a certain patent on a substance for the same medical indication.
A known issue with ChatGPT since its release has been accuracy. Factually incorrect answers sometimes crop up among otherwise accurate information. While still not perfect, this seems to be mitigated with GPT-4. According to OpenAI, GPT-4 demonstrates increased accuracy on facts vis-à-vis ChatGPT by 19 percentage points. This roughly means that, across all domains, GPT-4 should provide accurate information 70 % to 80 % of the time.
Clearly, this means that GPT-4 still requires a user to closely review its output rather than rely on it blindly. However, it should not be necessary to modify text generated by GPT-4 as often as before, potentially reducing the workload for the user.
Also, OpenAI showed examples of GPT-4 acting as a penetration tester, finding (admittedly rather simple) vulnerabilities in the source code of a web server, as well as designing and implementing a website based on a simple sketch of the desired front end.
The latter example demonstrates another (although not yet accessible) feature of GPT-4: multimodality. GPT-4 will be able to accept images as input alongside written text. This would close a gap that has often held back the interaction of LLMs with users in the past, as it can be hard to comprehensively describe in words the impression conveyed by an image.
The current pace of innovation displayed in the field of generative AI (especially GPT-type LLMs) seems difficult to overstate. More and more professional tasks may already be delegated to such AIs once it is accepted that certain inaccuracies remain. A probable outcome is not the abolition of professions in which generative AI can be employed (such as those of creatives, programmers and inventors) but an enormous increase in their productivity. Imagine a programmer being able to do multiple times the work in the same time.
Clearly, the apparent creativity of GPT-4 underlines two things: its potential to be used for deception and other tasks deemed undesirable by large parts of society, and the very nature of generative AI as a black box, similar to a human basing their decisions on unquantifiable "gut feelings".
And the legal issues associated with the use of generative AI and LLMs remain the same with GPT-4:
- AI making suggestions based on reasoning, which will always remain a black box at least to some degree, is a central reason for worldwide efforts to regulate its use.
- Who may be held liable in case suggestions by AI turn out to be hazardous in critical applications?
- How are such large AI models protected?
OpenAI's name used to be evocative of the impact it sought to achieve, i.e. making breakthrough AI models open source. This has changed. Obviously neither the trained GPT-3 model nor any of its successors have been made available for download. With GPT-4, OpenAI even declined to comment on model size. Given the considerable capital invested in its design and training, it is understandable that OpenAI does not intend to make any details available that would make it easier for potential competitors to create imitations. As copyright protection seems questionable, keeping any valuable AI model confidential seems the only reasonable approach to maintain at least some form of exclusivity.
- Could the output of generative AI invalidate or diminish the value of IP rights?
The advent of GPT-4 has the potential to significantly disrupt the traditional value of intellectual property (IP), including patents and copyrights. As an advanced AI language model, GPT-4 can generate novel ideas and content at unprecedented speeds, potentially undermining the exclusivity that IP rights confer. This could lead to a saturation of similar ideas and inventions, diluting the value of individual patents and copyrighted works. If all you need to do to find a new drug is hand the problem to an AI and test its output in a laboratory, what is the value of a patent granted as a reward for exceptional inventive conduct?
OpenAI seems to be fully aware of this and has implemented its own moral filter: https://arxiv.org/pdf/2303.08774.pdf, p. 12 et seq.