OpenAI introduces GPT-4 Turbo: Larger memory, lower cost, new knowledge

Nowicki

Ars Tribunus Angusticlavius
7,567
Busy playing with the 128K context window for documents. Absolutely bonkers and I thank Anthropic for leading the way here. Unfortunately for them, it means all my spend is going to OpenAI and GPT-4. It's a winner-take-all market.
128k context is great, but I suspect it's going to get way more bonkers. I've been waiting for it for some time. I want to try to feed it transcripts of things like depositions, long-form misinformation documentaries, and the like to see if it weeds out the logical fallacies, counterfactual statements, and general manipulation of the reader.

It definitely does it with shorter context where I feed it transcripts of a Fox News clip.
 
Upvote
58 (63 / -5)

quamquam quid loquor

Ars Tribunus Militum
2,853
Subscriptor++
OpenAI also announced a JSON mode that returns only valid JSON replies.

Setting response_format to json_object enables JSON mode. This guarantees that the message the model generates is valid JSON.

Note that your system prompt must still instruct the model to produce JSON, and to help ensure you don't forget, the API will throw an error if the string JSON does not appear in your system message. Also note that the message content may be partial (i.e. cut off) if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.

Must be one of text or json_object.
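A minimal sketch of what that contract might look like in practice, assuming the Chat Completions request shape; build_json_mode_request and the model name here are illustrative, and the local check just mirrors the documented requirement that the string "JSON" appear in the system message:

```python
def build_json_mode_request(system_prompt, user_prompt, model="gpt-4-1106-preview"):
    """Build a chat-completion payload with JSON mode enabled.

    Mirrors the documented constraint: the API throws an error if the
    string "JSON" does not appear in the system message.
    """
    if "JSON" not in system_prompt:
        raise ValueError('system prompt must mention "JSON" when using JSON mode')
    return {
        "model": model,
        "response_format": {"type": "json_object"},
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }

# The payload would be passed to something like
# client.chat.completions.create(**payload). Per the docs, the reply
# should parse as JSON unless finish_reason == "length" truncated it.
payload = build_json_mode_request(
    "You are a helpful assistant. Reply only in JSON.",
    "List three colors.",
)
```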
 
Upvote
35 (35 / 0)

Chip O.

Ars Scholae Palatinae
656
I've been having a lot of fun messing around with Bing's image generator. Want a Furby on a firetruck in space, with a rocket ship blasting through the background, Saturn, the Alien, an orange cat, and a giraffe playing an accordion in the style of Lisa Frank? It's not perfect, but it's gonna try.

I just wish it would give us an unrestricted version like its initial release; those gaslighty arguments were insane.
 

Attachments

  • OIG (80).jpeg
Upvote
28 (30 / -2)

adespoton

Ars Legatus Legionis
10,711
I told Bing Chat the following:
"Please rephrase the following text with all logical fallacies, counterfactual arguments and subject manipulations removed"

and then pasted the contents of this article before the video embed (without the links). This is what it responded with (including footnotes):

On Monday, OpenAI CEO Sam Altman announced a major update to its GPT-4 language model called GPT-4 Turbo at the OpenAI DevDay event. The new model can process a much larger amount of text than GPT-4 and features a knowledge cutoff of April 2023. Altman also introduced APIs for DALL-E 3, GPT-4 Vision, and text-to-speech, and launched an “Assistants API” that makes it easier for developers to build assistive AI apps 1. During the opening keynote, Altman showcased the wider impacts of OpenAI’s AI technology in the world, including helping people with tech accessibility. Over 2 million developers are building apps using OpenAI’s APIs, over 92 percent of Fortune 500 companies are building on their platform, and ChatGPT has over 100 million active weekly users 1. Microsoft CEO Satya Nadella also made a surprise appearance on the stage, talking with Altman about the deepening partnership between Microsoft and OpenAI and sharing some general thoughts about the future of the technology, which he thinks will empower people 1.

That's actually pretty impressive, and I think I'm going to be using that feature more often in the future.
 
Upvote
-6 (4 / -10)

Fatesrider

Ars Legatus Legionis
25,127
Subscriptor
As a bystander, I don't really understand why you need a super long context window. Why not maintain a prompt of a much shorter length and modify that as new information is added by the longer input? I certainly don't remember every detail of my history, code base or a book's plot, I compress it to the key details and go from there.
Given the context of some of the comments, the AI is used for more than conversational interaction. It's also used to parse data and such. The data can be short or long, but given that a lot of metadata is long, having a longer cache to use to process that data helps.

The other aspect is that the AI can keep track of a conversation LONGER than it otherwise would have. So you can refer back to something said however much data ago, and the AI will be able to reference and "remember" it.
 
Upvote
27 (30 / -3)
As a bystander, I don't really understand why you need a super long context window. Why not maintain a prompt of a much shorter length and modify that as new information is added by the longer input? I certainly don't remember every detail of my history, code base or a book's plot, I compress it to the key details and go from there.
For analyzing books, or book-length content. I've used it for a large number of projects in this regard—anything from querying for specific pieces of information in a nonfiction text to creating plot summaries and character breakdowns for a novel to popping in a homebrew TTRPG and using it to help with world-building and create rulesets. I find that ChatGPT works best as a brainstorming assistant, and when you have a text you're working from "together" it's a really fantastic one.
 
Upvote
30 (32 / -2)

darkowl

Ars Tribunus Militum
2,012
Subscriptor++
I’d be interested in knowing how they’re doing this. From what I understand, you can either retrain the whole model on a larger context, which is expensive and has a much higher memory requirement, or use tricks smart people have figured out, like quantisation to trade precision for lower memory pressure, plus things like RoPE scaling to increase context. (Just as an example; they may have used other methods or techniques.)

In both those cases precision/“accuracy” is sacrificed, but performance and requirements are kept more reasonable.

I’m guessing they won’t say if they sacrificed quality for the larger context, but it’d be interesting if there was a way to test it. I’ve seen people anecdotally (in comments and the like) saying GPT-4 is getting “worse”, but that may just be familiarity and inherent bias.
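OpenAI hasn't said what it did, but the position-interpolation flavour of RoPE scaling mentioned above can be sketched in a few lines. This is a toy illustration of the idea, not anyone's actual implementation: dividing positions by a scale factor squeezes a longer sequence into the angle range the model saw during training, at the cost of positional resolution.

```python
import numpy as np

def rope_angles(positions, dim=64, base=10000.0, scale=1.0):
    """Rotary-embedding angles for each (position, frequency) pair.

    Position interpolation divides positions by `scale`, so position
    scale*p produces the same angles the model saw at position p.
    """
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    pos = np.asarray(positions, dtype=np.float64) / scale
    return np.outer(pos, inv_freq)  # shape: (len(positions), dim // 2)

# With scale=4, position 8192 maps onto the angles of trained position 2048:
a = rope_angles([8192], scale=4.0)
b = rope_angles([2048], scale=1.0)
```

The intuition for the quality loss: after interpolation, nearby tokens are separated by smaller angle differences than the model was trained to distinguish.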
 
Upvote
11 (12 / -1)

Jt21

Smack-Fu Master, in training
58
I feel a disturbance in the force, like a million startups acting as thinly veiled wrappers over OAI APIs crying out in pain as they vanish into the ether...
One door closes, another door opens. We've been trying to work around some of the frustrating limitations of current LLMs. OpenAI removed some of those limitations today, and made other things easier to build; it just means that we can focus on other things.
 
Upvote
30 (32 / -2)

lucubratory

Ars Scholae Palatinae
1,430
Subscriptor++
It is a bad time in history to be cursed with chronic anxiety.
vacant eyes, unfocused, suddenly snap to yours with a crazed spark

"You know, ChatGPT helped me a lot with my anxiety, if that's an issue for you."

eyes defocus, they no longer appear to perceive you
 
Upvote
6 (12 / -6)

islane

Ars Scholae Palatinae
917
Subscriptor
128K context length

WOAH.
I was not expecting this... and here I was debating the merits of 4k and 8k context on my local models. In the past month or two, the giant 70b+ local models have been getting within striking distance of GPT4 (AKA they were at least better than GPT3.5). But 128k!? Nothing even comes close that can be run locally or in a realistically-sized cloud environment.
 
Upvote
39 (39 / 0)

TheManIsANobody

Ars Scholae Palatinae
731
Subscriptor++
Interesting: if you feed it the first half of a novel, can it finish the novel for you?
George RR Martin is furiously submitting what he has written so far for The Winds of Winter.

Or better yet, feed it the scripts for the first 5 seasons of GoT and see where it goes from there. It’s bound to be better.
 
Upvote
54 (54 / 0)

chip_1

Wise, Aged Ars Veteran
104
But 128k!? Nothing even comes close that can be run locally or in a realistically-sized cloud environment.

It's funny you say that, because this just dropped less than 24 hours ago. https://huggingface.co/01-ai/Yi-34B-200K

It's only a foundational model, so it's not terribly useful as-is. It needs to be finetuned and quantized before it's going to be usable. Also there's no guarantee that it's any good. With only 34b parameters to work with, even if it's excellent for what it is, it's certainly no threat to GPT-4. But it does go to show that people are working on long context models in the local LLM space.
 
Upvote
24 (24 / 0)

Bravesirrobinson

Wise, Aged Ars Veteran
189
Subscriptor
As a bystander, I don't really understand why you need a super long context window. Why not maintain a prompt of a much shorter length and modify that as new information is added by the longer input? I certainly don't remember every detail of my history, code base or a book's plot, I compress it to the key details and go from there.
How I've been using it: feed a bunch of stories about myself and a job description, then have it write a cover letter using input from both. Longer context windows means I can give it way more stories to choose from when it's trying to fit them to the job description.
 
Upvote
11 (11 / 0)

cadence

Ars Scholae Palatinae
1,004
Subscriptor++
It looks like ChatGPT's knowledge was also updated to April 2023, not just the new API. I wonder if that means that ChatGPT is now running on GPT-4 Turbo. I was not able to find any official information about it so far.
 
Upvote
14 (14 / 0)
128k context is great, but I suspect it's going to get way more bonkers. I've been waiting for it for some time. I want to try to feed it transcripts of things like depositions, long-form misinformation documentaries, and the like to see if it weeds out the logical fallacies, counterfactual statements, and general manipulation of the reader.

It definitely does it with shorter context where I feed it transcripts of a Fox News clip.
GPT-3.5-turbo-16k already works fine with short and medium-length documents like company financial reports. I can't imagine what 128k context would be like unless you're feeding it a novel or a huge internal corpus. I hope we can do away with the RAG technique and the hassle of searching for matching embedding vectors - just feed a giant slurp of data into the model and let it grok away.
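For anyone curious what the RAG step being replaced here looks like, a toy sketch: chunk the documents, embed each chunk, and keep only the chunks most similar to the query. The bag-of-words "embedding" below is a stand-in; real systems use a learned embedding model.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; real RAG uses a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "Revenue grew 12 percent year over year.",
    "The company opened three new offices.",
    "Net revenue guidance was raised for the next quarter.",
]
top = retrieve("what happened to revenue", chunks, k=2)
```

Only the retrieved chunks go into the prompt; a big enough context window lets you skip the retrieval step and paste everything.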
 
Upvote
6 (6 / 0)
It is kind of funny to witness the level of camaraderie from Altman to Nadella. I could almost hear him whispering, "If it weren't for those $10bn, you would get the fu** out of my OAI DevDay!"
The way I see it, Apple is focusing on on-device AI for image recognition, image processing, voice recognition and text-to-speech. It's great for selling devices and not much else.

Google is still trying to figure out which way to go while Facebook is lost in Zuckerbergian dreams, even as it does some solid ML research.

Microsoft is going all-out with AI on the cloud and its investment in OpenAI reflects that. We're getting to the point where Office can analyze and translate existing documents and come up with new ones. Imagine Power BI going beyond dashboards and into ML-powered forecasting and simulations. The on-device stuff could come soon with Intel and Qualcomm putting NPUs into laptop chips.

The scary part comes when AI stops assisting human decision-makers and becomes a decision maker. If Wintermute calls, just hang up.
 
Upvote
11 (11 / 0)
They are moving at breakneck speed. Impressive.

The 128k context window addresses its biggest limitation so far. Now you can feed it longer documents to summarize, to “chat” with, or to draw on for more accurate answers on certain topics. You can also expect longer answers.

I tested Claude 2's 100k context summarizing a 50+ page document and the result was already impressive. Can’t wait to see how GPT-4 Turbo fares.

On the other side, the speed at which some jobs are being replaced either partially or completely is also accelerating.
 
Upvote
6 (6 / 0)
Is there a good time?

Seriously, I have a time machine, if you know of one I'm outta here.
I know I look through heavily rose tinted glasses, and that there was also plenty to complain about then, but I feel like pre-2016 was a hopeful time.

Yes, all the social media hens came home to roost, and we got a confluence of horrible events (Trump, Brexit) - but... I don't know. Felt less existential.
 
Upvote
-2 (2 / -4)

orwelldesign

Ars Tribunus Angusticlavius
7,311
Subscriptor++
. If Wintermute calls, just hang up.

Uhh... I'm not about to be rude to our new AI overlords. That seems unwise.

The problem with the future is that a) it's very unevenly distributed and b) everyone thinks they are Case or Molly, when the overwhelming majority will be selling rat yakitori in a smoke-filled alley.

(Bring on "Basic" // Universal Income. We really need to do it starting now, because if we wait until society collapses, it'll be too late.)
 
Upvote
9 (10 / -1)
So I'm very interested in AI, and I have an obscure question that I don't know is valid or not. Right now, AI models have a limit on how much data can be input, measured in tokens. What happens if/when technology gets to the point where the token limit is so large that enormous swaths of the internet, or even the entire internet, can be used as a single prompt? Could it ever create a feedback loop where the input from a user is almost the entirety of an AI system's own dataset? What kind of output could result from this, if this is even feasible?

I know in the past it took billions of dollars and over a decade to map the human genome, but today it's a trivial $1,500 task with a few days of processing time. Could the entirety of today's dataset be a single input in future systems?

Thanks in advance if anyone has any thoughts on this.
 
Upvote
4 (4 / 0)
What’s going to be left for human knowledge workers to do? I fear we’re just going to end up as nothing more than spectators.
To me, the last currencies that will remain in this world will be human-made art, time, and human contact. Once all of the basics are set (Food, water, shelter, clothing, transportation, data, electricity), choosing what to do in life will be simply a matter of having a generative drive (check into Paul Conti's work on mental health) and sharing that drive with others.
 
Upvote
1 (1 / 0)

PorridgeHorrendous

Smack-Fu Master, in training
13
Also on Monday, OpenAI introduced what it calls "Copyright Shield," which is the company's commitment to protect its enterprise and API customers from legal claims related to copyright infringement due to using its text or image generators. The shield does not apply to ChatGPT free or Plus users. And OpenAI announced the launch of version 3 of its open source Whisper model, which handles speech recognition.
You know you're on strong ethical and legal grounds when you're compelled to offer protection from copyright litigation as a perk :rolleyes:
 
Upvote
-2 (3 / -5)
As a bystander, I don't really understand why you need a super long context window. Why not maintain a prompt of a much shorter length and modify that as new information is added by the longer input? I certainly don't remember every detail of my history, code base or a book's plot, I compress it to the key details and go from there.
Not sure why you are getting voted down; it's a reasonable question. For a lot of us working with earlier versions of the system, it's been a game to work with the smaller context windows. Generally we are recording conversational history in something like a database and using various techniques to summarize or reduce it, so you can maintain the illusion that the system is carrying on a long-term conversation even with the smaller context windows. That game is still relevant, because longer context windows don't solve everything (cost, for example; and maybe you want to pare down a history to a smaller context in order to get better and faster responses).

For example, you might want to build a conversational bot that uses your application documentation as its context in order to provide better help for your users. With a smaller context window you would have to first index the docs and then match the parts that seem most relevant to the question being asked, rather than just look at all the docs at once. There are various approaches to that. With a bigger context window you might be able to stick all the docs into the context. However, you might not want to do that anyway, since the system might get confused by so much irrelevant information; you might get a better response if you pare the docs down to the most relevant paragraphs. There's also a cost factor: you pay by the size of the context, so you still might wish to reduce it when possible.

Still, having the flexibility of a big context window is useful. There's only so much you can properly summarize; you always lose some information. If you were trying to summarize a novel with a small context window, you might try summarizing each chapter and then summarizing all the summaries. That is not easy to get right, which is why you often see people doing this sort of reduction also adding in techniques like overlapping the text (having the last part of the previous chapter included in the next chapter's summary, for example) and adding lots of metadata questions to make sure you properly track everything that is important. In those situations a bigger context window can help.

Larger context windows also mean your instruction prompts can get bigger and more complex; I suspect being able to create very big system prompts will be very useful. So in the end it's great to have the flexibility, but it's not a panacea for everything we still need as these tools evolve.
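The overlapping-chunk trick mentioned above could be sketched like this (a toy helper, with sizes in tokens; the function name is made up):

```python
def overlapping_chunks(tokens, size=1000, overlap=100):
    """Split a token list into windows that share `overlap` tokens,
    so each chunk's summary carries some context from the previous one."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

# Each chunk would be summarized separately, then the summaries
# summarized again ("summary of summaries"). Small numbers for clarity:
chunks = overlapping_chunks(list(range(25)), size=10, overlap=3)
```

Consecutive chunks share their boundary tokens, which is what keeps a chapter summary from losing the thread at the cut point.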
 
Upvote
12 (12 / 0)