Is our machine learning? Ars takes a dip into artificial intelligence


orwelldesign

Ars Tribunus Angusticlavius
7,317
Subscriptor++
Huh. Usually headlines with a question mark are "No"s, this one's a pretty aggressive "maybe."

Yay?

Looking forward to the rest of the series, frankly, though I do wish you'd anti-train it against being clickbaity.

This is, regardless, exactly the sort of thing that has me coming back for more. Wish there was an "upvote this article" button.
 
Upvote
67 (67 / 0)

Shanrak

Ars Scholae Palatinae
1,304
Subscriptor
Ars has given me data on over 5,500 headline tests over the past four years—11,000 headlines, each with their rate of click-throughs.

Garbage in, garbage out, as they say. If the only input is the number of click-throughs to an article, you are going to end up with clickbait headlines. I hope you have more data than that; curious to see what you come up with.

Either way, it should at least be entertaining.

Yes it should :D
 
Upvote
46 (50 / -4)

AquaSandwalker

Seniorius Lurkius
11
Subscriptor++
Sweet, I look forward to the next instalments!

Also, wanted to drop a couple of book recommendations for those that want to learn more about ML algorithms (a.k.a. stats) at a conceptual level:
1. Hello World: How to be Human in the Age of the Machine by Hannah Fry.
I found this to be a good place to start, very easy to read and more of a casual read, it gave me a good overview of where ML is used and some of the relevant concerns.
2. The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World by Pedro Domingos.
This one goes more in-depth on how ML algorithms work, with neat historical details on how they originated, while it ponders the future of machine 'intelligence'.
 
Upvote
22 (22 / 0)

Callias

Ars Scholae Palatinae
681
Subscriptor++
Very good article, in part because I really liked the case study of using something so seemingly simple — headline writing — to illustrate something so complicated, challenging, and ever-evolving.

We’ve all been in situations where you have to pick a decent subject line for an email so it stands out from the 250+ a day your boss gets, or craft a decent title-form summary of a Reddit post so you get helpful responses (yes, it can happen), or pick the right wording for a technical support ticket with a vendor.

So this “case study” has immediate “Oh, I get it” appeal, and I can see some applicability in my own life. Looking forward to the next installment.
 
Upvote
14 (14 / 0)

Unclebugs

Ars Praefectus
3,086
Subscriptor++
When I read what you were trying to do, I immediately thought of something I stumbled upon 23 years ago: latent semantic analysis (LSA). A professor who worked part-time at New Mexico State University, just 30 miles from my abode in El Paso, TX, was promoting this, and as a teacher of language arts grading essays, I was curious whether the technique could be adapted to grade essays. It could. I got on the mailing list, and the next thing I knew, I received an invitation to a classified ONI (Office of Naval Intelligence) conference.

I am now retired but spent 10 years as a journalist, often writing headlines, and another 11 years as a journalism teacher, teaching students how to write headlines. If you can get this thing to work, I'm sure the Journalism Education Association, or JEA (www.jea.org), would be interested, as they have a fully developed curriculum on writing headlines, which you can find on the website under educator resources. Good luck!
 
Upvote
42 (42 / 0)

Mustachioed Copy Cat

Ars Praefectus
5,043
Subscriptor++
Here is a task that some Ars writers are exceptionally good at: writing a solid headline. (Beth Mole, please report to collect your award.)

There’s a Marvel crossover series called Fear Itself wherein Wolverine gains armor made from the same metal as Thor’s hammer, which turns his claws a lambent orange, as though from tremendous heat, and gives him a set of secondary spikes that appear to protrude from other locations on his body (though whether this is a magical effect of the armor or a physical attachment to the armor is never made clear). Marvel heard that Wolverine fans liked claws, so it covered him in claws and gave the claws the ability to cook whatever they happened to cut or pierce.

I feel like giving Beth Mole encouragement is about the same as giving Wolverine extra, heated claws/spikes. It’s unnecessary, gratuitous, and is going to result in a lot of dead bodies.
 
Upvote
40 (40 / 0)
Ars has given me data on over 5,500 headline tests over the past four years—11,000 headlines, each with their rate of click-throughs.

Garbage in, garbage out, as they say. If the only input is the number of click-throughs to an article, you are going to end up with clickbait headlines. I hope you have more data than that; curious to see what you come up with.
Are you sure that the Ars audience clicks more often on the "clickbaity" headlines than the "good" ones in A/B testing?

Clickbaity doesn't automatically mean it gets more clicks in any given audience. It's a style that I think most here at Ars despise.
 
Upvote
12 (15 / -3)

donpsychote

Smack-Fu Master, in training
1
I performed a similar experiment years ago, only I scored the title appeal based on the number of social media shares. You have A/B data, which is different in that you can discriminate between different wordings. But I had very similar titles from different media sources, with wildly different engagement.

The main problem is that most of the traffic to any site comes from social media, and readers there decide based on their own criteria, which depend partly on temporal context, in other words, on what is happening at the time. A story about some event may be very attractive regardless of the title, but after media saturation, readers just won't click it no matter how well written the title is.

I then concluded the experiment with a mental note to repeat it some day, but to figure out how to pass the temporal context to the classifier as well.
 
Upvote
28 (28 / 0)

Wickwick

Ars Legatus Legionis
39,975
Given the Ars community, I think you're missing an opportunity to make this a competition. For anyone who's been paying close attention, the A/B headlines are all public, as well as the winners. So why not make the database public and let your readers compete for the best-performing system on a withheld validation set?

Better yet, ask one of the other Condé Nast sites for some A/B headlines to see how general the results are.
 
Upvote
53 (53 / 0)

Wickwick

Ars Legatus Legionis
39,975
I wonder if the images associated with headlines have anything to do with the success rate of clicking the headline?
This isn't a matter of clickthrough percentage for a story. This is the difference in how many people click through on headline A vs. headline B for the same subtitle and image.

Edit: although, the image is a confounding variable because it does act in concert with the chosen headline text. That would require several images as well as A/B headlines to suss out.
 
Upvote
9 (9 / 0)

Wickwick

Ars Legatus Legionis
39,975
I performed a similar experiment years ago, only I scored the title appeal based on the number of social media shares. You have A/B data, which is different in that you can discriminate between different wordings. But I had very similar titles from different media sources, with wildly different engagement.

The main problem is that most of the traffic to any site comes from social media, and readers there decide based on their own criteria, which depend partly on temporal context, in other words, on what is happening at the time. A story about some event may be very attractive regardless of the title, but after media saturation, readers just won't click it no matter how well written the title is.

I then concluded the experiment with a mental note to repeat it some day, but to figure out how to pass the temporal context to the classifier as well.
As I recall from previous discussions of this, the A/B test is only carried out for a short while - an hour, perhaps? Then the winning headline is used. The idea being, you're trying to get the most clicks. It doesn't make sense to continue running the test once you know which of the two is more likely to convert to a click-through.
 
Upvote
7 (7 / 0)

Deleted member 221201

Guest
Couple of minor suggestions

1. You could probably get away with using google colab depending on your dataset size

2. You could run the headline through word2vec & strip stopwords (or not) & then use a simple random forest classifier for a starter model
I'm assuming your target is number of clicks, which you can threshold 0/1 for binary or bin it

3. If you really need lexical analysis then a bidirectional LSTM will do the job, but depending on data size, use AWS or just let it run for a few hours to train on Colab
Again, this depends on how many layers you are stacking up & I suggest using Keras as a wrapper for TensorFlow

4. If you want a pre-made model then look at BERT.

5. Up-sample/down-sample as needed or adjust class weights if needed & run the sklearn classification report at the end

Have fun :)
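The word2vec-plus-random-forest starter in point 2 could be sketched roughly like this. This is a toy, not the author's actual pipeline: TF-IDF stands in for word2vec embeddings so the example needs only scikit-learn, and the headlines and win/loss labels are invented for illustration.

```python
# Sketch of suggestion 2: vectorize headlines, strip stop words, and fit a
# simple classifier on a binary "won its A/B test" target. TF-IDF stands in
# for word2vec here; the data below is made up.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

headlines = [
    "Scientists invent faster-than-light travel",
    "Ten boring facts about routers",
    "You won't believe this one weird benchmark",
    "Quarterly earnings report released on schedule",
]
# Binary target: 1 = the headline won its A/B test, 0 = it lost (toy labels)
won_ab_test = [1, 0, 1, 0]

model = make_pipeline(
    TfidfVectorizer(stop_words="english"),  # strip stop words while vectorizing
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(headlines, won_ab_test)

# Score an unseen headline: returns an array of 0/1 predictions.
print(model.predict(["You won't believe this benchmark"]))
```

Swapping the TfidfVectorizer for pretrained word2vec vectors (e.g. via gensim, averaged per headline) would follow the same shape: build an `X` matrix, then `fit`/`predict`.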
 
Upvote
22 (22 / 0)

seanmgallagher

Ars Tribunus Militum
1,911
Subscriptor
I performed a similar experiment years ago, only I scored the title appeal based on the number of social media shares. You have A/B data, which is different in that you can discriminate between different wordings. But I had very similar titles from different media sources, with wildly different engagement.

The main problem is that most of the traffic to any site comes from social media, and readers there decide based on their own criteria, which depend partly on temporal context, in other words, on what is happening at the time. A story about some event may be very attractive regardless of the title, but after media saturation, readers just won't click it no matter how well written the title is.

I then concluded the experiment with a mental note to repeat it some day, but to figure out how to pass the temporal context to the classifier as well.
As I recall from previous discussions of this, the A/B test is only carried out for a short while - an hour, perhaps? Then the winning headline is used. The idea being, you're trying to get the most clicks. It doesn't make sense to continue running the test once you know which of the two is more likely to convert to a click-through.

It's a very very short test, not even an hour. Any changes to headlines after an hour are deliberate human efforts to fix flawed headlines. At least, as far as I know. I don't work here anymore.
 
Upvote
29 (29 / 0)

seanmgallagher

Ars Tribunus Militum
1,911
Subscriptor
Couple of minor suggestions

1. You could probably get away with using google colab depending on your dataset size

2. You could run the headline through word2vec & strip stopwords (or not) & then use a simple random forest classifier for a starter model
I'm assuming your target is number of clicks, which you can threshold 0/1 for binary or bin it

3. If you really need lexical analysis then a bidirectional LSTM will do the job, but depending on data size, use AWS or just let it run for a few hours to train on Colab
Again, this depends on how many layers you are stacking up & I suggest using Keras as a wrapper for TensorFlow

4. If you want a pre-made model then look at BERT.

5. Up-sample/down-sample as needed or adjust class weights if needed & run the sklearn classification report at the end

Have fun :)

Yeah, I'm several weeks into this game right now and I wish we had talked sometime in May :D
 
Upvote
27 (27 / 0)

Shanrak

Ars Scholae Palatinae
1,304
Subscriptor
Ars has given me data on over 5,500 headline tests over the past four years—11,000 headlines, each with their rate of click-throughs.

Garbage in, garbage out, as they say. If the only input is the number of click-throughs to an article, you are going to end up with clickbait headlines. I hope you have more data than that; curious to see what you come up with.
Are you sure that the Ars audience clicks more often on the "clickbaity" headlines than the "good" ones in A/B testing?

Clickbaity doesn't automatically mean it gets more clicks in any given audience. It's a style that I think most here at Ars despise.

More clicks = clickbaity regardless of audience

It remains to be seen if the typical Ars audience prefers a different form of clickbait headline than what is typically considered clickbait :D
 
Upvote
8 (11 / -3)

Wickwick

Ars Legatus Legionis
39,975
Couple of minor suggestions

1. You could probably get away with using google colab depending on your dataset size

2. You could run the headline through word2vec & strip stopwords (or not) & then use a simple random forest classifier for a starter model
I'm assuming your target is number of clicks, which you can threshold 0/1 for binary or bin it

3. If you really need lexical analysis then a bidirectional LSTM will do the job, but depending on data size, use AWS or just let it run for a few hours to train on Colab
Again, this depends on how many layers you are stacking up & I suggest using Keras as a wrapper for TensorFlow

4. If you want a pre-made model then look at BERT.

5. Up-sample/down-sample as needed or adjust class weights if needed & run the sklearn classification report at the end

Have fun :)

Yeah, I'm several weeks into this game right now and I wish we had talked sometime in May :D
This IS Ars...

I mean, how often does one of the authors of a paper being covered show up in the comments? Or someone that worked on said widget, etc.
 
Upvote
13 (14 / -1)
Ars has given me data on over 5,500 headline tests over the past four years—11,000 headlines, each with their rate of click-throughs.

Garbage in, garbage out, as they say. If the only input is the number of click-throughs to an article, you are going to end up with clickbait headlines. I hope you have more data than that; curious to see what you come up with.
Are you sure that the Ars audience clicks more often on the "clickbaity" headlines than the "good" ones in A/B testing?

Clickbaity doesn't automatically mean it gets more clicks in any given audience. It's a style that I think most here at Ars despise.

More clicks = clickbaity regardless of audience

It remains to be seen if the typical Ars audience prefers a different form of clickbait headline than what is typically considered clickbait :D
I disagree that more clicks inherently means something is clickbait. For something to be clickbait, it has to sensationalize, mislead about, or overpromise the actual content of the article.
 
Upvote
39 (39 / 0)

Wickwick

Ars Legatus Legionis
39,975
Ars has given me data on over 5,500 headline tests over the past four years—11,000 headlines, each with their rate of click-throughs.

Garbage in, garbage out, as they say. If the only input is the number of click-throughs to an article, you are going to end up with clickbait headlines. I hope you have more data than that; curious to see what you come up with.
Are you sure that the Ars audience clicks more often on the "clickbaity" headlines than the "good" ones in A/B testing?

Clickbaity doesn't automatically mean it gets more clicks in any given audience. It's a style that I think most here at Ars despise.

More clicks = clickbaity regardless of audience

It remains to be seen if the typical Ars audience prefers a different form of clickbait headline than what is typically considered clickbait :D
I disagree that more clicks inherently means something is clickbait. For something to be clickbait, it has to sensationalize, mislead about, or overpromise the actual content of the article.
The headline "Scientists Invent Faster-Than-Light Travel" is going to get clicks. If it's actually reporting on such work then it's not clickbait. If it's followed by "...in a new Netflix series" when you click on the article then it's clickbait.
 
Upvote
30 (30 / 0)

Deleted member 221201

Guest
Couple of minor suggestions

1. You could probably get away with using google colab depending on your dataset size

2. You could run the headline through word2vec & strip stopwords (or not) & then use a simple random forest classifier for a starter model
I'm assuming your target is number of clicks, which you can threshold 0/1 for binary or bin it

3. If you really need lexical analysis then a bidirectional LSTM will do the job, but depending on data size, use AWS or just let it run for a few hours to train on Colab
Again, this depends on how many layers you are stacking up & I suggest using Keras as a wrapper for TensorFlow

4. If you want a pre-made model then look at BERT.

5. Up-sample/down-sample as needed or adjust class weights if needed & run the sklearn classification report at the end

Have fun :)

Yeah, I'm several weeks into this game right now and I wish we had talked sometime in May :D


Please stick with pandas to load the dataset & use sklearn or keras if making a deep learning model; it will save you a lot of grief

1. Use df.loc for slicing & avoid copies in pandas

2. For keras something like this
Code:
X = np.array(df[features].tolist())
y = your_trained_model.predict(X)   # super fast vectorized op
df['prediction'] = pd.Series(y[:, 0])

Doing the same thing above using df.apply() is asking for trouble & a lot of wasted hours
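A minimal runnable version of the advice above, with a stand-in "model" (a plain thresholding function, purely hypothetical) in place of a trained Keras network; the column and feature names are invented:

```python
# Sketch of the pandas tips: slice with df.loc, then run one vectorized
# prediction over the whole frame instead of calling df.apply() per row.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "headline": ["A tale of two clicks", "Why DNS matters", "One weird trick"],
    "length": [20, 15, 15],
    "has_number": [0, 0, 1],
})

features = ["length", "has_number"]

# df.loc slicing avoids chained-indexing copies; .to_numpy() yields the 2-D
# array a model's predict() would expect.
X = df.loc[:, features].to_numpy()

def stand_in_model_predict(X):
    # Pretend model: "predict a click" when the headline contains a number.
    return (X[:, 1] > 0).astype(int)

# One vectorized call over all rows -- the fast path the comment recommends.
df["prediction"] = stand_in_model_predict(X)
print(df[["headline", "prediction"]])
```

The same assignment via `df.apply()` would invoke Python once per row; the vectorized call hands the whole array to NumPy at once, which is where the speedup comes from.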
 
Upvote
5 (5 / 0)

OtherSystemGuy

Ars Scholae Palatinae
1,293
Subscriptor++
I see a couple of things here. On the data, one commenter mentioned the picture associated with the article. I'd add the short text under the title and the author's name; I actually look at those as well to decide what I'll read. And that takes me into the data analysis part...

Since you have the full count of clicks, one analysis would be to simply plot the data to see how far apart the winner and loser are from each other and use that as a new data point. Titles that are far apart are more interesting and should take greater weight than those close to each other.

I'm going to watch this series with interest because I'm not sure how well this is going to work out. I suspect that phrase structure is probably going to be important. How often do Beth's punny titles generate a click? An algorithm looking solely at words will miss the crafting of word placement and relationship.

I do have to keep reminding myself that this project is a title scoring system and not a title generator.
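The winner/loser-gap idea could be sketched like this; the column names and click counts are invented, and margin-as-sample-weight is one plausible reading of the suggestion, not the article's method:

```python
# Sketch: compute the winner/loser click gap for each A/B test and use it
# as a per-sample weight, so decisive tests count more than near-ties.
import pandas as pd

tests = pd.DataFrame({
    "clicks_a": [120, 300, 95],
    "clicks_b": [80, 290, 40],
})

total = tests["clicks_a"] + tests["clicks_b"]
# Margin in [0, 1]: 0 means a dead heat, 1 means one headline took every click.
tests["margin"] = (tests["clicks_a"] - tests["clicks_b"]).abs() / total

# Many sklearn estimators accept this directly via fit(..., sample_weight=...).
sample_weight = tests["margin"]
print(tests)
```

Plotting the `margin` column as a histogram would show how many of the 5,500 tests were decisive versus coin flips, which is useful to know before trusting the labels.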
 
Upvote
10 (10 / 0)

entropy_wins

Ars Tribunus Militum
1,690
Subscriptor++
Ars has given me data on over 5,500 headline tests over the past four years—11,000 headlines, each with their rate of click-throughs.

Garbage in, garbage out, as they say. If the only input is the number of click-throughs to an article, you are going to end up with clickbait headlines. I hope you have more data than that; curious to see what you come up with.
Are you sure that the Ars audience clicks more often on the "clickbaity" headlines than the "good" ones in A/B testing?

Clickbaity doesn't automatically mean it gets more clicks in any given audience. It's a style that I think most here at Ars despise.

More clicks = clickbaity regardless of audience

It remains to be seen if the typical Ars audience prefers a different form of clickbait headline than what is typically considered clickbait :D
I disagree that more clicks inherently means something is clickbait. For something to be clickbait, it has to sensationalize, mislead about, or overpromise the actual content of the article.
The headline "Scientists Invent Faster-Than-Light Travel" is going to get clicks. If it's actually reporting on such work then it's not clickbait. If it's followed by "...in a new Netflix series" when you click on the article then it's clickbait.

That's what Google has become. They'll mention an actor, say, but the image will be of someone who appeared with that actor and might be more famous.

Not sure how we solve this...

 
Upvote
3 (3 / 0)

Wickwick

Ars Legatus Legionis
39,975
I see a couple of things here. On the data, one commenter mentioned the picture associated with the article. I'd add the short text under the title and the author's name; I actually look at those as well to decide what I'll read. And that takes me into the data analysis part...

Since you have the full count of clicks, one analysis would be to simply plot the data to see how far apart the winner and loser are from each other and use that as a new data point. Titles that are far apart are more interesting and should take greater weight than those close to each other.

I'm going to watch this series with interest because I'm not sure how well this is going to work out. I suspect that phrase structure is probably going to be important. How often do Beth's punny titles generate a click? An algorithm looking solely at words will miss the crafting of word placement and relationship.

I do have to keep reminding myself that this project is a title scoring system and not a title generator.
The trouble is the subheading and the picture are the same regardless of the A/B heading. So are you suggesting including that in the training set as part of the data for each title? That could only help in choosing a winner for contextually based models. Otherwise they add data without any differentiation.
 
Upvote
3 (3 / 0)
Couple of minor suggestions

1. You could probably get away with using google colab depending on your dataset size

2. You could run the headline through word2vec & strip stopwords (or not) & then use a simple random forest classifier for a starter model
I'm assuming your target is number of clicks, which you can threshold 0/1 for binary or bin it

3. If you really need lexical analysis then a bidirectional LSTM will do the job, but depending on data size, use AWS or just let it run for a few hours to train on Colab
Again, this depends on how many layers you are stacking up & I suggest using Keras as a wrapper for TensorFlow

4. If you want a pre-made model then look at BERT.

5. Up-sample/down-sample as needed or adjust class weights if needed & run the sklearn classification report at the end

Have fun :)

Yeah, I'm several weeks into this game right now and I wish we had talked sometime in May :D


Ah, but you haven’t lived until you’ve tried to explain how ML translates to direct and indirect revenue streams to a roomful of suits, most of whom need IT to ‘fix’ their laptops regularly.

So fun….
 
Upvote
8 (8 / 0)

OtherSystemGuy

Ars Scholae Palatinae
1,293
Subscriptor++
The trouble is the subheading and the picture are the same regardless of the A/B heading. So are you suggesting including that in the training set as part of the data for each title? That could only help in choosing a winner for contextually based models. Otherwise they add data without any differentiation.

Agreed, but that's my concern about the limited scope of the experiment and why I'm interested in following along.
 
Upvote
1 (1 / 0)

ukeandhike

Ars Scholae Palatinae
1,057
Love this, should be interesting even if it falls spectacularly flat.

Once you’ve done this and (hopefully) have a functional model, you should put the model up against Beth and do more A/B testing of Beth vs the Machine… so have the model run with everyone else’s headlines and then whatever the model says is the most likely to win out, pit THAT headline against Beth’s creations.
 
Upvote
3 (3 / 0)

adespoton

Ars Legatus Legionis
10,723
I can't wait for this upcoming Ars headline:

"SpaceX says: 'Just one low price to put users of new iOS Trump video game into space for a million years, without Facebook, Twitter, or windows'"

You've got a problem with that headline: it's 130 characters. See if you can shorten it to 70!
 
Upvote
6 (6 / 0)