OpenAI signs AI deal with Condé Nast

Status
Not open for further replies.

Mrbonk

Ars Scholae Palatinae
886
Subscriptor
Just to keep making this part clear: nobody is making any money off of comments. We are not selling them to OpenAI. Ken has specifically stated this.

This is my understanding based on our article and Ken's comments and no insider knowledge:

The issue is the deal between Condé and OpenAI removed our block on scraping Ars which was in our robots.txt file. Which means it does technically open up scraping the forum since that's under our main URL. I don't think we are aware they are, or even care, but they could. We don't see any money for it.

We are attempting to block the forum from scraping in our robots.txt file (/civis/ which is the path to all forum comments), but we have to get permission to do so since it's technically part of a larger deal, and can't just unilaterally do it.
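
For anyone wondering what that kind of rule does in practice, here is a minimal sketch using Python's standard-library robots.txt parser. The robots.txt contents, the "ExampleBot" user agent, and the example.com URLs are all made up for illustration; this is not the actual Ars file, and it assumes a crawler that chooses to honor robots.txt at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (an assumption, not the real Ars file):
# everything is crawlable except the forum path /civis/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /civis/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a well-behaved crawler may fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs: one under the forum path, one article page.
forum_ok = is_allowed(
    ROBOTS_TXT, "ExampleBot", "https://example.com/civis/threads/some-thread.123/"
)
article_ok = is_allowed(
    ROBOTS_TXT, "ExampleBot", "https://example.com/ai/2024/08/some-article/"
)

print("forum allowed:", forum_ok)
print("article allowed:", article_ok)
```

Keep in mind robots.txt is only a request: it tells compliant crawlers what to skip, and nothing in it stops a scraper that decides to ignore the file.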

That's my understanding at least, I don't want to comment overmuch on a deal I have not been a part of. This is all about as much transparency as I can do.
It really seems like people cancelling their subs is just a net negative to Ars as a whole for something they have no control over. I know people think you guys should be all torches and pitchforks about it and pushing back as a group. But realistically I think we all know it would accomplish nothing.
Cancelling subs will just potentially cause some lost jobs, which is the opposite of what we all should want to achieve given the mass consolidation of journalism as a whole the last few decades.

It unequivocally SUCKS that at a minimum you guys are not getting a portion of the money from this deal for your work. I mean really how many people are going to ask ChatGPT to link them to the latest ARS article, but people probably will ask it questions that will be based on work you made and you will get 0 new eyes on your real work. That's fucked and CN CEO deserves a kick to the balls.

I haven't ever subbed and have been having minor financial issues lately but have been thinking about it. I donate money every month to TechDirt already and I feel after all this that when the time is right in the near future I will finally sub. We should be fighting to make sure you all still get to do your job and be able to make a living.
 
Upvote
6 (21 / -15)
It really seems like people cancelling their subs is just a net negative to Ars as a whole for something they have no control over. I know people think you guys should be all torches and pitchforks about it and pushing back as a group. But realistically I think we all know it would accomplish nothing.
Cancelling subs will just potentially cause some lost jobs, which is the opposite of what we all should want to achieve given the mass consolidation of journalism as a whole the last few decades.

It unequivocally SUCKS that at a minimum you guys are not getting a portion of the money from this deal for your work. I mean really how many people are going to ask ChatGPT to link them to the latest ARS article, but people probably will ask it questions that will be based on work you made and you will get 0 new eyes on your real work. That's fucked and CN CEO deserves a kick to the balls.

I haven't ever subbed and have been having minor financial issues lately but have been thinking about it. I donate money every month to TechDirt already and I feel after all this that when the time is right in the near future I will finally sub. We should be fighting to make sure you all still get to do your job and be able to make a living.
Money is the only thing the people calling the shots understand.
 
Upvote
33 (36 / -3)

Deleted member 693467

Guest
It's not, to be honest. Our biggest and best argument is that we're opposed to it, they don't need it, and it stifles participation. A bunch of people canceling subs and then still hanging out here ultimately has the opposite effect as intended.

I hope people can trust that we're making the best case we can.

Stifles participation? Delete my account, please, hope it facilitates the argument.

I personally refuse to willfully support the blatant free for all data grab op that LLM / AI has become.

My opinion / take is, they acquired data without consent, used that data to train a thing, wooed investors with the thing, then gained enough $$$ for lawyers to dig themselves out of the hole they dug, now on to the "sign up or get left behind" ultimatum. Just, no.
 
Upvote
23 (28 / -5)

PsychoArs

Ars Scholae Palatinae
986
Subscriptor
Ars comments have no monetary value to anyone else, as Ken said we are not selling them, or getting paid for them.

But to us, Ars Technica? They have tremendous value, in the sense of community, emotionally, and the continuity. We're one of the oldest continuous communities on the internet. You can go back and read stories from over 20 years ago from members who are still here posting regularly.

We have stories that are almost myth-like in their status, like the person who swallowed the 7 key from their keyboard, or Fugly the pet octopus. This is our history, and it's important.

You can even find it referenced outside of Ars. Here's a reddit post that reposts the Fugly story (top comment):


View: https://www.reddit.com/r/todayilearned/comments/32re7s/til_there_was_an_octopus_in_an_aquarium_that/


You can find it by searching on Google too, because the forum has always been indexed by search engines. It's always been public.


IMHO there's a world of difference between something being in an index that allows it to be located by a human being versus something being in an index that allows it to be manipulated by a machine learning process directed by a human being.

"Based on comments on Ars Technica, what passwords is PsychoArs likely to use?"

That's a representative example only. Obviously anywhere important PsychoArs would be using MFA, so even if an LLM could produce a list of passwords and even if one of them were correct, that list wouldn't itself be useful.

But... here's the deal. We all know better - I hope - than to post our comings and goings on social media. You shouldn't announce a vacation until you've returned from it. Because it's fairly easy for someone wanting to rob your house to use that kind of information to figure out when it's vacant.

Only... it's a lot of work to figure out my habits plus find my spouse and their habits and anyone else like dependents who might live here, plus say... cleaners, landscapers, and so on, and what their schedules are. For many households it's just... too much work so our houses are reasonably safe.

But with an LLM that has access to all of that raw data scattered in multiple places, it's only a matter of time before this sort of analysis is trivial. It's like... quantum computing versus classical encryption. Classical is fine right now because nobody's got ten thousand years of super-computer time to throw at cracking a single password. But one technical evolution and the situation changes dramatically.

I don't do social media. But I do participate in technical communities and a few hobby communities. I believe we should be drawing a line between what an individual can pull off on their own versus how automation can amplify their efforts. What I write here is intended to help the community think about or understand a topic. It is not intended to be weaponized against me or anyone else.

I do get it that someone is going to scrape this place, ignoring robots.txt no matter what. Just like ransomware groups don't give a rat's ass about what is right and what is wrong, there will be adversarial machine-learning engines screwing us over. I get that. But we don't have to say "bad guys are doing it so let's let everyone do it".

I said my last post would be my last meaningful one here but I feel this clarification is worth extending my participation for.

Hopefully comments/forum content will be permitted to be excluded but I can't imagine any incentive for OpenAI to say "oh, those negotiations we concluded with your parent company that permits us to do whatever the eff we want with whatever the eff we want, whenever the eff we want, well, yeah, sure, we'd be happy if you just hid the vast majority of text from us."
 
Upvote
31 (34 / -3)
I'm not going to rage quit Ars over this, but I will admit to being disappointed by Conde Nast's action.

If they gave any damn at all about Ars, they'd know that this would go over with the users like a lead balloon and negatively affect Ars' income. But they're going to keep any money generated by this travesty and not even offer Ars a cut to make up for the inevitable lost income.

Then they'll wonder why the place fell apart.

We're talking Musk level inability to understand their users. It's amazing. I didn't think anyone else could be that blind to their own actions.
 
Upvote
33 (36 / -3)

qwertyqwertz

Smack-Fu Master, in training
61
Subscriptor++
The company determines what goes in our robots.txt. That said, I am asking if we can do exactly that: block /civis/.
That would be fantastic.

My kneejerk reaction has been that this whole thing is shitty, especially since my initial takeaway from the article was that all content (including comments) was included in the deal. I really appreciate the willingness to engage in the comments that you, @Aurich, and other staff have shown.

I am really looking forward to a future update from Ars.
 
Upvote
25 (25 / 0)
Just to keep making this part clear: nobody is making any money off of comments. We are not selling them to OpenAI. Ken has specifically stated this.

This is my understanding based on our article and Ken's comments and no insider knowledge:

The issue is the deal between Condé and OpenAI removed our block on scraping Ars which was in our robots.txt file. Which means it does technically open up scraping the forum since that's under our main URL. I don't think we are aware they are, or even care, but they could. We don't see any money for it.

We are attempting to block the forum from scraping in our robots.txt file (/civis/ which is the path to all forum comments), but we have to get permission to do so since it's technically part of a larger deal, and can't just unilaterally do it.

That's my understanding at least, I don't want to comment overmuch on a deal I have not been a part of. This is all about as much transparency as I can do.

My last comment.

In other words, as of now the comments are part of the deal and CN is therefore making money off them since they are part of the deal. Let's not act like OpenAI doesn't see value in forum comments or the comments are accidentally included in the deal and no one involved thought about them. OpenAI most certainly did and I'd be shocked if CN didn't either.

So, right now, someone is making money off the comments. And I'll be really shocked if you're allowed to have the comments not included since they are no doubt something OpenAI would like clear rights to and intended to have rights to as well.
 
Upvote
17 (23 / -6)

xoe

Ars Scholae Palatinae
7,496
@Aurich @Ken Fisher the way you guys have handled this issue has made me finally take the plunge and subscribe. Especially the part where you made clear that subscription revenue goes directly to Ars and not to CN. I really appreciate the work you and all Ars staff do and I love the community. I can't let perfect be the enemy of good, especially in a reality where perfect is likely impossible to find. I have a bit of regret in my contribution to providing information that led some people to unsubscribe, though I take comfort knowing that someone else would have discovered and reported on that info if I had not.
 
Upvote
24 (35 / -11)
altman_thief.jpg
 
Upvote
-14 (6 / -20)

C64 raids Bungling Bay

Ars Tribunus Militum
1,963
Subscriptor
"Hey ChatGPT, based on all comments made by user XYZ, build a detailed profile of the user's interests, leanings, occupation, geographic location, and provide a list of possible corresponding real world IDs."

I guess most of this is already possible with manual labor, but vacuuming all user posts up into training data will make it a lot easier.
It's useful to change to a new account every few years, if not annually. The little pieces of personal information add up over time. As does your personal diction, which is easy for an LLM to pick up on.

I tried ChatGPT before posting this. Today, it's not doxxing some of the famous posters here, but don't count on that for long.
 
Upvote
11 (11 / 0)

LauraW

Ars Scholae Palatinae
1,004
Subscriptor++
The company determines what goes in our robots.txt. That said, I am asking if we can do exactly that: block /civis/.
Got it. Thanks for the response. Hopefully they'll say yes, since it seems pretty clear that their intent was to exclude comments, but they assumed all the comments would live under /comments/.

In hindsight, and knowing the Ars commentariat, it probably would have been a good idea to make the whole "We hope to exclude comments but that's working its way through the system" thing clear in the article. It might have headed off some of the angry comments.
 
Upvote
13 (15 / -2)

herozero

Ars Scholae Palatinae
1,155
Just to keep making this part clear: nobody is making any money off of comments. We are not selling them to OpenAI. Ken has specifically stated this.

This is my understanding based on our article and Ken's comments and no insider knowledge:

The issue is the deal between Condé and OpenAI removed our block on scraping Ars which was in our robots.txt file. Which means it does technically open up scraping the forum since that's under our main URL. I don't think we are aware they are, or even care, but they could. We don't see any money for it.

We are attempting to block the forum from scraping in our robots.txt file (/civis/ which is the path to all forum comments), but we have to get permission to do so since it's technically part of a larger deal, and can't just unilaterally do it.

That's my understanding at least, I don't want to comment overmuch on a deal I have not been a part of. This is all about as much transparency as I can do.
The nonsense spewing from you and Ken on this thread is breaking my heart, literally and sincerely. It’s completely tone deaf.

Not all things are beautiful because they last. Looking at the comment numbers from articles post-“we are/aren’t making money from getting a bunch of money to our corporate parent for maybe using your content with OpenAI” trainwreck announcement, I think the user community has spoken with their wallets appropriately.
 
Upvote
-9 (16 / -25)
I can't let perfect be the enemy of good, especially in a reality where perfect is likely impossible to find.
This is why, although I am disappointed, I don't subscribe. The staff here don't really have much control and wouldn't be able to prevent OpenAI and others from scraping even if they wanted to. And they get something out of it, which is deserved for the community and good news coverage. If Ars is gone it leaves not many good alternatives for tech coverage.

I don't like not being compensated but I've never had an expectation of that. If I post on the internet I expect anybody will use it for anything, including ML. It's always been that way. It would be nice if people could export and/or delete their data but I get not wanting to do that because it does break the conversation flow. I can't demand the perfect here.
 
Upvote
-6 (2 / -8)

JustChilling

Ars Centurion
265
Subscriptor
OK I am going to have to seriously start considering deleting my Ars account and all of the data associated with it, as I previously said in my comments. I already very rarely comment now, so I guess I will not lose much if I do.

I also find it sad that companies have still not learnt that technology like this is just going to destroy more of their business in the long term by making people dependent on OpenAI like the way they are dependent for search on Alphabet.

Lastly does Lynch seriously think that OpenAI and co take the rights of the original data owners seriously?

Edited.
 
Upvote
10 (12 / -2)

nathand496

Ars Scholae Palatinae
1,225
Looking at the comment numbers from articles post-“we are/aren’t making money from getting a bunch of money to our corporate parent for maybe using your content with OpenAI” trainwreck announcement, I think the user community has spoken with their wallets appropriately.

Comparing day to day comment numbers rather than only today's after this announcement doesn't support this statement (based on my brief comparison). But it might be interesting to look into more. I bet Aurich or someone has the comment statistics.
 
Upvote
-1 (0 / -1)

caramelpolice

Ars Tribunus Militum
1,669
Subscriptor
Something I think people have forgotten: Ars Technica has been monetizing our comments from the very beginning. The engagement in the forums is a large part of why many, if not the vast majority of subscribers choose to do so.

So the outrage over CN making a buck off of the forums potentially being used to help train an LLM (which is definitely already happening) falls a bit flat.
You know full well that the beef people have here is with OpenAI specifically.
 
Upvote
34 (34 / 0)
I haven't been editing old comments, and I went back to check this comment from a couple hours ago: it's not editable. If it used to be something enabled on a case-to-case basis, it seems to be universal now.
Kinda odd how that wasn't even mentioned in the article. And they could have even just allowed you to edit but kept a copy as backup so nobody would even notice. Not sure what this move achieved other than pissing people off. In fact the comments have more pissed off people than an Elon Musk story. Impressive.
 
Upvote
2 (2 / 0)

Unsheept

Ars Praefectus
3,453
Subscriptor
I don't get why people think their comments are so valuable that they should be excluded. Reeks of self-importance that people think their comments on a website so amazingly profound that they should be omitted or that compensation should be in order.
I'm an expert in my field, with decades of experience in real-world application of that subject matter (plus leadership and people management) built through expenditure of my own time and money. My input on those subjects is absolutely worth compensation and I will not give it away on a website that claims what I post only to sell / trade it to search or AI.

I'm willing to bet there's a large number of users on this site that can say the same.
 
Upvote
25 (25 / 0)

WhatDoYouHear

Wise, Aged Ars Veteran
141
This is my last post for now, it's after 6pm, I need dinner, and I just don't think I can do anything to really help much beyond what I've said.

But my utterly personal guess, based on nothing but feelings, that I cannot promise anything from? I think it's the opposite of slim. I don't think OpenAI cares at all about our user comments, and is purely interested in authoritative article content, and figuring out how they're going to move into a future where the next NY Times isn't suing them.

I could be full of shit. I have zero knowledge of the deal, I'm not involved, I cannot speak to it even if I wanted to. And if I was involved I'd probably have to say even less honestly.

Pure conjecture and opinion time!

I really think this is an oversight that could have been avoided, was lost in the shuffle, and we're going to fix it and still pay the price for it. And that's why I'm going to drink this canned Mai Tai that's more alcohol than delicious and try and take the rest of the night off.
Enjoy the well-deserved drink. I hope you are right. I will definitely still read Ars in the future, but commenting for the direct benefit of CN and OpenAI isn't really my thing. Will look forward to the hoped-for update soon. I wish I shared your optimism, but given that Reddit is essentially all non-authoritative content and OpenAI was willing to shell out cash for it, I am not overly hopeful that OpenAI is solely interested in the great stuff you guys write. The articles are what, 10% of the text on Ars? In a world that's increasingly AI-generated, they would be fools not to want the human comments, too.
 
Upvote
21 (21 / 0)
You know full well that the beef people have here is with OpenAI specifically.
It is and that has never made much sense to me. They are maybe the most prominent player but hardly the least ethical. Meta doesn't get nearly the hate it deserves and Anthropic isn't that much better than OpenAI.

If they do end up training on this forum they are going to have to correct for that. Maybe invert votes for OpenAI threads if they are going to use that metadata at all. It'd be embarrassing to have the chat agent shit all over their own company and leader.
 
Upvote
-8 (1 / -9)
Aurich: Thanks for laying out the specifics of the deal and situation so clearly.

Aurich and Ken: I'm a little confused by your line of reasoning here.

First off, Ars Technica {the property} is ABSOLUTELY generating more money today as a result of this deal than Ars Technica {the property} was generating before this deal went through. If Ars Technica content were excluded from the deal, the deal would objectively have less value.

So, what I hear you telling us is that Ars Technica {the publication} is not expecting to see a red cent of the cash being generated by Ars Technica {the property} as a result of OpenAI monetizing Ars Technica content (both of journalists and users) roll back into making Ars Technica {the publication} a better and more robust outlet. And I assume this situation is typical among other Conde Nast publications?

I'm not sure why you think this would make anyone who supports Ars feel better about having their content scraped and monetized by OpenAI.

If we're stuck with it anyhow, I think I'd be happier if there were some kind of significant tangible benefit to at least someone. I don't feel like "... and just to be clear, there is literally ZERO upside to you as a user or to Ars Technica as a publication from this terrible thing we've started doing ..." is quite the flex you think it is.

Doesn't that just make it, like, so much more gross? They aren't even pretending like the money being generated from this will benefit their actual publications that provide the revenue boost. CN has effectively given OpenAI a license to simulate "their" journalists for all eternity and aren't even giving those journalists a cut of the deal. How the hell does that work?

Not blaming you guys in the trenches, but that doesn't remotely make it better in my mind. Self-centered user worries about comments notwithstanding, I don't see how everyone working there isn't friggin pissed as hell right now. I'd consider going on strike.

This deal sucks.
 
Upvote
53 (54 / -1)

WhatDoYouHear

Wise, Aged Ars Veteran
141
I'm an expert in my field, with decades of experience in real-world application of that subject matter (plus leadership and people management) built through expenditure of my own time and money. My input on those subjects is absolutely worth compensation and I will not give it away on a website that claims what I post only to sell / trade it to search or AI.

I'm willing to bet there's a large number of users on this site that can say the same.
Ditto. I have made tens of thousands of dollars, outside of my primary job, selling my thoughts on subjects that frequently appear on Ars. I have always been OK freely donating that knowledge to the community in exchange for the insight I have received from the community. I am not OK doing it for the direct financial benefit of CN and OpenAI. Community gets it for free. Corporations pay. Pretty simple, really.

Until the "CN isn't actually selling the comments" update gets posted, see you later, Ars!
 
Upvote
36 (38 / -2)
I don't think OpenAI cares at all about our user comments
It would be foolish not to. On 95% of technical matters there are some very authoritative answers and these can be weighted with the metadata you already have.

Frequently the comments are more factual than the articles, which are already excellent. On some topics I know very well I can spot mistakes, but these are very often pointed out. That is extremely useful on its own or as part of some kind of critique model as they developed recently to correct GPT-4's posts.
 
Upvote
13 (15 / -2)
My last comment.

In other words, as of now the comments are part of the deal and CN is therefore making money off them since they are part of the deal. Let's not act like OpenAI doesn't see value in forum comments or the comments are accidentally included in the deal and no one involved thought about them. OpenAI most certainly did and I'd be shocked if CN didn't either.

So, right now, someone is making money off the comments. And I'll be really shocked if you're allowed to have the comments not included since they are no doubt something OpenAI would like clear rights to and intended to have rights to as well.

Given how hard they're burning money and investors are starting to worry that they're not making any money and have nothing to make money on I'm going to assume that no one makes money.

Ditto. I have made tens of thousands of dollars, outside of my primary job, selling my thoughts on subjects that frequently appear on Ars. I have always been OK freely donating that knowledge to the community in exchange for the insight I have received from the community. I am not OK doing it for the direct financial benefit of CN and OpenAI. Community gets it for free. Corporations pay. Pretty simple, really.

Until the "CN isn't actually selling the comments" update gets posted, see you later, Ars!

This is really something I can understand, when talking about that level of insight. One person here had the same idea and that user's post history had no particularly deep insights into anything, and for those comments I don't quite get it. But for actual like professional insights? Yeah, go free man.
 
Upvote
3 (3 / 0)