OpenAI introduces GPT-4 Turbo: Larger memory, lower cost, new knowledge

Nowicki

Ars Tribunus Angusticlavius
7,567
Busy playing with the 128K context window for documents. Absolutely bonkers and I thank Anthropic for leading the way here. Unfortunately for them, it means all my spend is going to OpenAI and GPT-4. It's a winner-take-all market.
128k context is great, but I suspect it's going to get way more bonkers. I've been waiting for it for some time. I want to try to feed it transcripts of things like depositions, long-form misinformation documentaries, and the like to see if it weeds out the logical fallacies, counterfactual statements, and general manipulation of the reader.

It definitely does it with shorter context where I feed it transcripts of a Fox News clip.
 
Upvote
58 (63 / -5)

quamquam quid loquor

Ars Tribunus Militum
2,853
Subscriptor++
OpenAI also announced a JSON mode that returns only valid JSON replies.

Setting response_format to json_object enables JSON mode. This guarantees that the message the model generates is valid JSON.

Note that your system prompt must still instruct the model to produce JSON, and to help ensure you don't forget, the API will throw an error if the string JSON does not appear in your system message. Also note that the message content may be partial (i.e. cut off) if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.

Must be one of text or json_object.
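A minimal sketch of what that contract might look like in practice, assuming the Chat Completions request shape; build_json_mode_request and the model name here are illustrative, and the local check just mirrors the documented requirement that the string "JSON" appear in the system message:

```python
def build_json_mode_request(system_prompt, user_prompt, model="gpt-4-1106-preview"):
    """Build a chat-completion payload with JSON mode enabled.

    Mirrors the documented constraint: the API throws an error if the
    string "JSON" does not appear in the system message.
    """
    if "JSON" not in system_prompt:
        raise ValueError('system prompt must mention "JSON" when using JSON mode')
    return {
        "model": model,
        "response_format": {"type": "json_object"},
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }

# The payload would be passed to something like
# client.chat.completions.create(**payload). Per the docs, the reply
# should parse as JSON unless finish_reason == "length" truncated it.
payload = build_json_mode_request(
    "You are a helpful assistant. Reply only in JSON.",
    "List three colors.",
)
```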
 
Upvote
35 (35 / 0)

Chip O.

Ars Scholae Palatinae
656
I've been having a lot of fun messing around with Bing's image generator. Want a Furby on a firetruck in space, with a rocket ship blasting through the background, Saturn, the Alien, an orange cat, and a giraffe playing an accordion in the style of Lisa Frank? It's not perfect, but it's gonna try.

I just wish it would give us an unrestricted version like its initial release; those gaslighty arguments were insane.
 

Attachments

  • OIG (80).jpeg
Upvote
28 (30 / -2)

adespoton

Ars Legatus Legionis
10,711
I told Bing Chat the following:
"Please rephrase the following text with all logical fallacies, counterfactual arguments and subject manipulations removed"

and then pasted the contents of this article before the video embed (without the links). This is what it responded with (including footnotes):

On Monday, OpenAI CEO Sam Altman announced a major update to its GPT-4 language model called GPT-4 Turbo at the OpenAI DevDay event. The new model can process a much larger amount of text than GPT-4 and features a knowledge cutoff of April 2023. Altman also introduced APIs for DALL-E 3, GPT-4 Vision, and text-to-speech, and launched an “Assistants API” that makes it easier for developers to build assistive AI apps 1. During the opening keynote, Altman showcased the wider impacts of OpenAI’s AI technology in the world, including helping people with tech accessibility. Over 2 million developers are building apps using OpenAI’s APIs, over 92 percent of Fortune 500 companies are building on their platform, and ChatGPT has over 100 million active weekly users 1. Microsoft CEO Satya Nadella also made a surprise appearance on the stage, talking with Altman about the deepening partnership between Microsoft and OpenAI and sharing some general thoughts about the future of the technology, which he thinks will empower people 1.

That's actually pretty impressive, and I think I'm going to be using that feature more often in the future.
 
Upvote
-6 (4 / -10)

Fatesrider

Ars Legatus Legionis
25,127
Subscriptor
As a bystander, I don't really understand why you need a super long context window. Why not maintain a prompt of a much shorter length and modify that as new information is added by the longer input? I certainly don't remember every detail of my history, code base or a book's plot, I compress it to the key details and go from there.
Given the context of some of the comments, the AI is used for more than conversational interaction. It's also used to parse data and such. The data can be short or long, but given that a lot of metadata is long, having a longer cache to use to process that data helps.

The other aspect is that the AI can keep track of a conversation LONGER than it otherwise would have. So you can refer back to something said however much data ago, and the AI will be able to reference and "remember" it.
 
Upvote
27 (30 / -3)
As a bystander, I don't really understand why you need a super long context window. Why not maintain a prompt of a much shorter length and modify that as new information is added by the longer input? I certainly don't remember every detail of my history, code base or a book's plot, I compress it to the key details and go from there.
For analyzing books, or book-length content. I've used it for a large number of projects in this regard—anything from querying for specific pieces of information in a nonfiction text to creating plot summaries and character breakdowns for a novel to popping in a homebrew TTRPG and using it to help with world-building and create rulesets. I find that ChatGPT works best as a brainstorming assistant, and when you have a text you're working from "together" it's a really fantastic one.
 
Upvote
30 (32 / -2)

darkowl

Ars Tribunus Militum
2,012
Subscriptor++
I’d be interested in knowing how they’re doing this. From what I understand, you can either retrain the whole model on a larger context, which is expensive and has a much higher memory requirement, or use tricks smart people have figured out, like quantisation to trade precision for lower memory pressure, plus things like RoPE scaling to increase context. (Just as an example; they may have used other methods or techniques.)

In both those cases precision/“accuracy” is sacrificed, but performance and requirements are kept more reasonable.

I’m guessing they won’t say if they sacrificed quality for the larger context, but it’d be interesting if there was a way to test it. I’ve seen people anecdotally (in comments and the like) saying GPT-4 is getting “worse”, but that may just be familiarity and inherent bias.
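OpenAI hasn't said what it did, but the position-interpolation flavour of RoPE scaling mentioned above can be sketched in a few lines. This is a toy illustration of the idea, not anyone's actual implementation: dividing positions by a scale factor squeezes a longer sequence into the angle range the model saw during training, at the cost of positional resolution.

```python
import numpy as np

def rope_angles(positions, dim=64, base=10000.0, scale=1.0):
    """Rotary-embedding angles for each (position, frequency) pair.

    Position interpolation divides positions by `scale`, so position
    scale*p produces the same angles the model saw at position p.
    """
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    pos = np.asarray(positions, dtype=np.float64) / scale
    return np.outer(pos, inv_freq)  # shape: (len(positions), dim // 2)

# With scale=4, position 8192 maps onto the angles of trained position 2048:
a = rope_angles([8192], scale=4.0)
b = rope_angles([2048], scale=1.0)
```

The intuition for the quality loss: after interpolation, nearby tokens are separated by smaller angle differences than the model was trained to distinguish.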
 
Upvote
11 (12 / -1)

Jt21

Smack-Fu Master, in training
58
I feel a disturbance in the force, like a million startups acting as thinly veiled wrappers over OAI APIs crying out in pain as they vanish into the ether...
One door closes, another door opens. We've been trying to work around some of the frustrating limitations of current LLMs. OpenAI removed some of those limitations today, and made other things easier to build; it just means that we can focus on other things.
 
Upvote
30 (32 / -2)

lucubratory

Ars Scholae Palatinae
1,430
Subscriptor++
It is a bad time in history to be cursed with chronic anxiety.
vacant eyes, unfocused, suddenly snap to yours with a crazed spark

"You know, ChatGPT helped me a lot with my anxiety, if that's an issue for you."

eyes defocus, they no longer appear to perceive you
 
Upvote
6 (12 / -6)

islane

Ars Scholae Palatinae
917
Subscriptor
128K context length

WOAH.
I was not expecting this... and here I was debating the merits of 4k and 8k context on my local models. In the past month or two, the giant 70b+ local models have been getting within striking distance of GPT4 (AKA they were at least better than GPT3.5). But 128k!? Nothing even comes close that can be run locally or in a realistically-sized cloud environment.
 
Upvote
39 (39 / 0)

TheManIsANobody

Ars Scholae Palatinae
731
Subscriptor++
Interesting: if you feed it the first half of a novel, can it finish the novel for you?
George RR Martin is furiously submitting what he has written so far for The Winds of Winter.

Or better yet, feed it the scripts for the first 5 seasons of GoT and see where it goes from there. It’s bound to be better.
 
Upvote
54 (54 / 0)

chip_1

Wise, Aged Ars Veteran
104
But 128k!? Nothing even comes close that can be run locally or in a realistically-sized cloud environment.

It's funny you say that, because this just dropped less than 24 hours ago. https://huggingface.co/01-ai/Yi-34B-200K

It's only a foundational model, so it's not terribly useful as-is. It needs to be finetuned and quantized before it's going to be usable. Also there's no guarantee that it's any good. With only 34b parameters to work with, even if it's excellent for what it is, it's certainly no threat to GPT-4. But it does go to show that people are working on long context models in the local LLM space.
 
Upvote
24 (24 / 0)

Bravesirrobinson

Wise, Aged Ars Veteran
189
Subscriptor
As a bystander, I don't really understand why you need a super long context window. Why not maintain a prompt of a much shorter length and modify that as new information is added by the longer input? I certainly don't remember every detail of my history, code base or a book's plot, I compress it to the key details and go from there.
How I've been using it: feed a bunch of stories about myself and a job description, then have it write a cover letter using input from both. Longer context windows means I can give it way more stories to choose from when it's trying to fit them to the job description.
 
Upvote
11 (11 / 0)

cadence

Ars Scholae Palatinae
1,004
Subscriptor++
It looks like ChatGPT's knowledge was also updated to April 2023, not just the new API. I wonder if that means that ChatGPT is now running on GPT-4 Turbo. I was not able to find any official information about it so far.
 
Upvote
14 (14 / 0)
128k context is great, but I suspect it's going to get way more bonkers. I've been waiting for it for some time. I want to try to feed it transcripts of things like depositions, long-form misinformation documentaries, and the like to see if it weeds out the logical fallacies, counterfactual statements, and general manipulation of the reader.

It definitely does it with shorter context where I feed it transcripts of a Fox News clip.
GPT-3.5-turbo-16k already works fine with short and medium-length documents like company financial reports. I can't imagine what 128k context would be like unless you're feeding it a novel or a huge internal corpus. I hope we can do away with the RAG technique and the hassle of searching for matching embedding vectors - just feed a giant slurp of data into the model and let it grok away.
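For anyone curious what the RAG step being replaced here looks like, a toy sketch: chunk the documents, embed each chunk, and keep only the chunks most similar to the query. The bag-of-words "embedding" below is a stand-in; real systems use a learned embedding model.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; real RAG uses a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "Revenue grew 12 percent year over year.",
    "The company opened three new offices.",
    "Net revenue guidance was raised for the next quarter.",
]
top = retrieve("what happened to revenue", chunks, k=2)
```

Only the retrieved chunks go into the prompt; a big enough context window lets you skip the retrieval step and paste everything.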
 
Upvote
6 (6 / 0)
It is kind of funny to witness the level of camaraderie from Altman to Nadella. I could almost hear him whispering, "If it weren't for those $10bn, you would get the fu** out of my OAI DevDay!"
The way I see it, Apple is focusing on on-device AI for image recognition, image processing, voice recognition and text-to-speech. It's great for selling devices and not much else.

Google is still trying to figure out which way to go while Facebook is lost in Zuckerbergian dreams, even as it does some solid ML research.

Microsoft is going all-out with AI on the cloud and its investment in OpenAI reflects that. We're getting to the point where Office can analyze and translate existing documents and come up with new ones. Imagine Power BI going beyond dashboards and into ML-powered forecasting and simulations. The on-device stuff could come soon with Intel and Qualcomm putting NPUs into laptop chips.

The scary part comes when AI stops assisting human decision-makers and becomes a decision maker. If Wintermute calls, just hang up.
 
Upvote
11 (11 / 0)
They are moving at breakneck speed. Impressive.

The 128k context window addresses its biggest limitation so far. Now you can feed it longer documents to summarize, to “chat” with, or to draw on for more accurate answers on certain topics. You can also expect longer answers.

I tested Claude 2's 100k context summarizing a 50+ page document and the result was already impressive. Can’t wait to see how GPT-4 Turbo fares.

On the other side, the speed at which some jobs are being replaced either partially or completely is also accelerating.
 
Upvote
6 (6 / 0)
Is there a good time?

Seriously, I have a time machine, if you know of one I'm outta here.
I know I look through heavily rose tinted glasses, and that there was also plenty to complain about then, but I feel like pre-2016 was a hopeful time.

Yes, all the social media hens came home to roost, and we got a confluence of horrible events (Trump, Brexit) - but... I don't know. Felt less existential.
 
Upvote
-2 (2 / -4)

orwelldesign

Ars Tribunus Angusticlavius
7,311
Subscriptor++
. If Wintermute calls, just hang up.

Uhh... I'm not about to be rude to our new AI overlords. That seems unwise.

The problem with the future is that a) it's very unevenly distributed and b) everyone thinks they are Case or Molly, when the overwhelming majority will be selling rat yakitori in a smoke-filled alley.

(Bring on "Basic" // Universal Income. We really need to do it starting now, because if we wait until society collapses, it'll be too late.)
 
Upvote
9 (10 / -1)
So I'm very interested in AI, and I have an obscure question that I don't know is valid or not. Right now, AI models have a limit on how much data can be input, measured in tokens. What happens if/when technology gets to the point where the token limit is so large that enormous swaths of the internet, or even the entire internet, can be used as a single prompt? Could it ever create a feedback loop where the input from a user is almost the entirety of an AI system's own dataset? What kind of output could result from this, if this is even feasible?

I know in the past it took billions of dollars and over a decade to map the human genome, but today it's a trivial $1,500 task with a few days of processing time. Could the entirety of today's dataset be a single input in future systems?

Thanks in advance if anyone has any thoughts on this.
 
Upvote
4 (4 / 0)
What’s going to be left for human knowledge workers to do? I fear we’re just going to end up as nothing more than spectators.
To me, the last currencies that will remain in this world will be human-made art, time, and human contact. Once all of the basics are set (Food, water, shelter, clothing, transportation, data, electricity), choosing what to do in life will be simply a matter of having a generative drive (check into Paul Conti's work on mental health) and sharing that drive with others.
 
Upvote
1 (1 / 0)

PorridgeHorrendous

Smack-Fu Master, in training
13
Also on Monday, OpenAI introduced what it calls "Copyright Shield," which is the company's commitment to protect its enterprise and API customers from legal claims related to copyright infringement due to using its text or image generators. The shield does not apply to ChatGPT free or Plus users. And OpenAI announced the launch of version 3 of its open source Whisper model, which handles speech recognition.
You know you're on strong ethical and legal grounds when you're compelled to offer protection from copyright litigation as a perk :rolleyes:
 
Upvote
-2 (3 / -5)
As a bystander, I don't really understand why you need a super long context window. Why not maintain a prompt of a much shorter length and modify that as new information is added by the longer input? I certainly don't remember every detail of my history, code base or a book's plot, I compress it to the key details and go from there.
Not sure why you are getting voted down; it's a reasonable question. For a lot of us working with earlier versions of the system, it's been a game to work with the smaller context windows. Generally we are recording conversational history in something like a database and using various techniques to summarize or reduce it, so you can maintain the illusion that the system is carrying on a long-term conversation even with the smaller context windows. That game is still relevant, because longer context windows don't solve everything (cost, for example; and maybe you want to pare down a history to a smaller context in order to get better and faster responses).

For example, you might want to build a conversational bot that uses your application documentation as its context in order to provide better help for your users. With a smaller context window you would have to first index the docs and then match the parts that seem most relevant to the question being asked, rather than just look at all the docs at once. There are various approaches to that. With a bigger context window you might be able to stick all the docs into the context. However, you might not want to do that anyway, since the system might get confused by so much irrelevant information; you might get a better response if you pare the docs down to the most relevant paragraphs. There's also a cost factor: you pay by the size of the context, so you still might wish to reduce it when possible.

Still, having the flexibility of a big context window is useful. There's only so much you can properly summarize; you always lose some information. If you were trying to summarize a novel with a small context window, you might try summarizing each chapter and then summarizing all the summaries. That is not easy to get right, which is why you often see people doing this sort of reduction also adding in techniques like overlapping the text (having the last part of the previous chapter included in the next chapter's summary, for example) and adding lots of metadata questions to make sure you properly track everything that is important. In those situations a bigger context window can help.

Larger context windows also mean your instruction prompts can get bigger and more complex; I suspect being able to create very big system prompts will be very useful. So in the end it's great to have the flexibility, but it's not a panacea for everything we still need as these tools evolve.
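The overlapping-chunk trick mentioned above could be sketched like this (a toy helper, with sizes in tokens; the function name is made up):

```python
def overlapping_chunks(tokens, size=1000, overlap=100):
    """Split a token list into windows that share `overlap` tokens,
    so each chunk's summary carries some context from the previous one."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

# Each chunk would be summarized separately, then the summaries
# summarized again ("summary of summaries"). Small numbers for clarity:
chunks = overlapping_chunks(list(range(25)), size=10, overlap=3)
```

Consecutive chunks share their boundary tokens, which is what keeps a chapter summary from losing the thread at the cut point.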
 
Upvote
12 (12 / 0)