
Forecasting Box Office Success of Movies: Combining Text Mining and Neural Networks

Michael Henry, Ramesh Sharda, and Dursun Delen


Institute for Research in Information Systems
Department of Management Science and Information Systems
William S. Spears School of Business
Oklahoma State University
(Ben Johnson and Xin Cao on MFG implementation)

Forecasting Box-Office Receipts: A Tough Problem!

“… No one can tell you how a movie is going to do in the marketplace… not until the film opens in darkened theatre and sparks fly up between the screen and the audience”
Mr. Jack Valenti
President and CEO
of the Motion Picture Association of America

2 Movie Forecasting Update: M2007.SAS


Introduction

– 3rd highest-grossing movie in 2003.
– Both sequels are in the top 5 grossing films of all time.
– Combined earnings of $2.6 billion worldwide.

The Movie Times.com


Forecasting Before a Movie is Made

• Tough problem

• Most recent attempt (Eliashberg et al 2007).

• Our approach



Our Approach – Movie Forecast Guru

• Data: movies released between 1998 and 2006

• Movie decision parameters:
– Intensity of competition rating
– MPAA rating
– Star power
– Genre
– Technical effects
– Sequel?
– Estimated screens at opening
– …

• Output: box office gross receipts (flop → blockbuster)

Class No.   Range (in millions)
1           under 1 (Flop)
2           1 to 10
3           10 to 20
4           20 to 40
5           40 to 65
6           65 to 100
7           100 to 150
8           150 to 200
9           over 200 (Blockbuster)
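The class ranges above can be read as a simple lookup from gross receipts to a class number. The following is a minimal sketch, not the authors' implementation; the function name `gross_to_class` and the handling of values that fall exactly on a boundary are assumptions, since the slide does not say how ties are resolved.

```python
from bisect import bisect_right

# Boundaries (in millions of dollars) between classes 1..9, as listed on the slide.
CLASS_BOUNDS = [1, 10, 20, 40, 65, 100, 150, 200]


def gross_to_class(gross_millions: float) -> int:
    """Map gross receipts (in millions) to a class from 1 (flop) to 9 (blockbuster).

    Assumption: a value exactly on a boundary is placed in the higher class.
    """
    if gross_millions < 0:
        raise ValueError("gross receipts cannot be negative")
    # bisect_right counts how many boundaries are <= the value.
    return bisect_right(CLASS_BOUNDS, gross_millions) + 1


if __name__ == "__main__":
    for gross in (0.5, 15, 99.9, 250):
        print(gross, "->", gross_to_class(gross))
```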


Method: Neural Networks and Others
• Output
– Box office receipts: 9 categories
• Flop (category 1)
• Blockbuster (category 9)
– Return on Investment

• Prediction Results
– Bingo and 1-away
– ROI and MAPE



Previous Results

Models Tested on 2006 Data

Training data   1998-2002   2003-2005   1998-2005
Bingo           45.82%      44.67%      49.86%
1-Away          86.46%      82.13%      86.74%


Text Mining Experiments


• Use text mining on movie plot summaries to increase prediction accuracy.

• The following text categories were created to analyze the plot summaries:
– War Category
– Organization Category
– Life Category
– Character Category



Experiment One
• Examples of concepts in each category:
– War
• World war, war of the ring, rising cold war…
– Organization
• CIA, FBI, MI6…
– Life
• Small town life, hard life, civilian life…
– Character
• Willy Wonka, Harry Potter, Bruce Wayne…
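One way such categories could become model inputs is to count how often each category's concepts appear in a plot summary. The slides do not describe the actual feature extraction, so the sketch below is only an illustration under that assumption; `CONCEPTS` holds just the example phrases listed above, and `category_counts` is a hypothetical helper name.

```python
import re

# Only the example concepts shown on the slide; the full concept lists are not given.
CONCEPTS = {
    "war": ["world war", "war of the ring", "rising cold war"],
    "organization": ["cia", "fbi", "mi6"],
    "life": ["small town life", "hard life", "civilian life"],
    "character": ["willy wonka", "harry potter", "bruce wayne"],
}


def category_counts(plot_summary: str) -> dict:
    """Count whole-phrase, case-insensitive concept matches per category."""
    text = plot_summary.lower()
    counts = {}
    for category, phrases in CONCEPTS.items():
        counts[category] = sum(
            len(re.findall(r"\b" + re.escape(phrase) + r"\b", text))
            for phrase in phrases
        )
    return counts


if __name__ == "__main__":
    # Hypothetical plot summary, written for this example.
    summary = "Bruce Wayne is recruited by the CIA during a world war."
    print(category_counts(summary))
```

Each count (or a 0/1 flag derived from it) could then be appended to the other movie parameters before training.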


Experiment One

1998 - 2005 model on 2006 data

          No Category   War      Organization   Life     Character
Bingo     49.86%        53.03%   51.59%         52.16%   53.60%
1-away    88.18%        87.32%   89.63%         88.18%   89.63%



Classification Analysis

• All category models had a higher bingo rate than the original neural network model. On 1-away, War was lower (87.32% vs. 88.18%) and Life was equal (88.18%).

• Organization and character categories had the best 1-away rate.
– 1-away: 89.63%

• Character category had the best bingo rate.
– Bingo: 53.60%


Experiment Two

• Compare results with other research.
– Eliashberg et al. (2007): text mining

• Calculate return on investment.
– ROI = ((0.55 * revenue) - budget) / budget
– Prediction is whether a movie's ROI falls above or below the median ROI
– Eliashberg et al. (2007): 61% hit rate
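The ROI formula and the above/below-median target can be written out directly. This is a minimal sketch: the function names are assumptions, the example movie is hypothetical, and the slide does not explain the 0.55 factor, so it is kept as a named constant exactly as given.

```python
REVENUE_FACTOR = 0.55  # multiplier applied to revenue in the slide's ROI formula


def roi(revenue: float, budget: float) -> float:
    """ROI = ((0.55 * revenue) - budget) / budget, as stated on the slide."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return (REVENUE_FACTOR * revenue - budget) / budget


def is_above_median(roi_value: float, median_roi: float) -> bool:
    """Binary target: does the movie's ROI exceed the median ROI?"""
    return roi_value > median_roi


if __name__ == "__main__":
    # Hypothetical movie: $50M budget and $100M revenue.
    example = roi(revenue=100.0, budget=50.0)  # roughly 0.10, i.e. about 10%
    print(f"ROI: {example:.2%}")
    # Hypothetical median ROI of 0% for illustration only.
    print("Above median:", is_above_median(example, median_roi=0.0))
```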



Experiment Two
• Training Set
– 1998 to 2005 (905 Movies)

• Test Set
– 2006 (104 Movies)

• Median ROI of Training Set
– -42.10%

• Highest Hit Rate
– 73%

Experiment Two

Actual and Predicted ROI

                           Predicted below median ROI   Predicted above median ROI
Actual below median ROI    29                           25
Actual above median ROI    3                            47

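The hit rate follows from the actual-versus-predicted ROI counts (29, 25, 3, 47): correct predictions divided by all test movies. The sketch below only redoes that arithmetic; the dictionary layout and key names are assumptions made for readability.

```python
# Keys are (actual, predicted) relative to the median ROI; counts are from the slide.
confusion = {
    ("below", "below"): 29,
    ("below", "above"): 25,
    ("above", "below"): 3,
    ("above", "above"): 47,
}

total = sum(confusion.values())
# A prediction is a hit when the actual and predicted sides of the median agree.
correct = sum(count for (actual, predicted), count in confusion.items()
              if actual == predicted)
hit_rate = correct / total

if __name__ == "__main__":
    print(f"{correct} of {total} correct, hit rate {hit_rate:.2%}")
```

The total of 104 matches the size of the 2006 test set, and the resulting rate rounds to the 73% reported as the highest hit rate.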


Experiment Two
ROI Criteria for 2006 data

           Eliashberg   No Category   War      Org      Life     Character
Hit Rate   61%          70.19%        67.31%   72.12%   73.08%   71.15%


ROI Analysis

• All neural network models had a higher prediction accuracy than the Eliashberg model.

• Category life had the highest hit rate.
– Other performance measures are also needed, because an above/below-median split can be guessed correctly about 50% of the time before any predictions are made.
• Classification
• Mean Absolute Percent Error
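For reference, mean absolute percent error is conventionally the average of |actual - predicted| / |actual|, expressed as a percentage. The sketch below uses that standard definition; the slides do not state which quantity the error is computed on, and the numbers in the example are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percent error, in percent, using the standard definition."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical actual and predicted values, for illustration only.
    print(f"MAPE: {mape([100.0, 200.0], [110.0, 180.0]):.2f}%")
```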



Experiment Two

1998 - 2005 model on 2006 budget data

          No Category   War      Organization   Life     Character
Bingo     41.35%        44.23%   43.27%         41.35%   43.27%
1-away    82.69%        78.85%   83.65%         85.58%   83.65%


Classification Analysis
• All category models had a bingo rate higher than or equal to that of the original neural network model.

• Category war had the highest bingo rate, but the lowest 1-away rate.

• Category life had the highest 1-away rate, but the lowest bingo rate.

• Categories organization and character had the best overall prediction rate.



Experiment Two
MAPE Criteria for 2006 budget data

        No Category   War      Org      Life     Character
MAPE    63.97%        64.38%   61.01%   63.82%   62.40%


MAPE Analysis
• Category organization had the lowest percent error.
– Category character had the second lowest.

• Category war had the highest percent error.
– Had more specific concepts compared to other categories.

• Category life had the second highest percent error among the text mining categories.
– Had more general concepts compared to other categories.

• Suggests that categories with specific concepts produce the best results.



What about recent movies?

• Create model on 1998 - 2006 data

• Test on 2007 sample data


Sample Predictions
Movie                   Actual   No Category   War   Organization   Life   Character
Spider-Man 3            9        9             9     9              9      9
300                     9        5             7     5              5      5
The Simpsons Movie      8        9             9     9              9      9
The Bourne Ultimatum    8        6             7     9              9      7
Knocked Up              7        5             5     5              5      7
Live Free or Die Hard   7        9             7     9              9      9
Evan Almighty           6        6             7     7              7      7
1408                    6        5             5     4              5      5
TMNT                    5        4             3     4              5      5
Music and Lyrics        5        4             5     5              5      5
Freedom Writers         4        4             3     4              4      5
Reign Over Me           3        4             4     4              4      4
Black Snake Moan        2        3             3     3              3      3
Ta Ra Rum Pum           1        1             1     1              1      1
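The two classification measures can be applied to these sample predictions. The sketch below assumes the usual reading of the terms: "bingo" is an exact class match and "1-away" is a prediction within one class of the actual value. The function names are illustrative, and the class numbers are copied from the table in the order the movies are listed.

```python
# Actual classes and model predictions for the 14 sample movies, in table order.
ACTUAL = [9, 9, 8, 8, 7, 7, 6, 6, 5, 5, 4, 3, 2, 1]
PREDICTIONS = {
    "No Category":  [9, 5, 9, 6, 5, 9, 6, 5, 4, 4, 4, 4, 3, 1],
    "War":          [9, 7, 9, 7, 5, 7, 7, 5, 3, 5, 3, 4, 3, 1],
    "Organization": [9, 5, 9, 9, 5, 9, 7, 4, 4, 5, 4, 4, 3, 1],
    "Life":         [9, 5, 9, 9, 5, 9, 7, 5, 5, 5, 4, 4, 3, 1],
    "Character":    [9, 5, 9, 7, 7, 9, 7, 5, 5, 5, 5, 4, 3, 1],
}


def _check(actual, predicted):
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def bingo_rate(actual, predicted):
    """Fraction of movies whose predicted class equals the actual class."""
    _check(actual, predicted)
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def one_away_rate(actual, predicted):
    """Fraction of movies predicted within one class of the actual class."""
    _check(actual, predicted)
    return sum(abs(a - p) <= 1 for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    for model, predicted in PREDICTIONS.items():
        print(f"{model:<13} bingo {bingo_rate(ACTUAL, predicted):.2%}  "
              f"1-away {one_away_rate(ACTUAL, predicted):.2%}")
```

With only 14 movies, these rates are illustrative rather than a basis for ranking the models.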



Prediction Analysis
• Text mining allowed at least one model to classify the following movies correctly:
– Knocked Up, Live Free or Die Hard, TMNT, Music and Lyrics

• The war category model came closest to predicting that “300” would be a financial success (class 7, versus class 5 for the other models).

• All text mining models were within one class for “The Bourne Ultimatum”.

• Important Note:
– The sample does not contain enough entries to show statistically which model is the best overall.
– Classifications may change as revenue increases for movies that are still in theaters.


What about other software?

• Compare test results between SAS and Clementine
– Training: 1998 to 2005 data
– Test: 2006 data

• Performance Evaluation
– Bingo
– 1-Away



SAS vs Clementine

1998 to 2005 model on 2006 data

SAS Clementine
Bingo 52.16% 49.86%
1-Away 89.34% 88.18%


Results So Far…
• Text mining plot summaries increases prediction accuracy.

• Neural networks can successfully predict return on investment with the use of text mining.

• Neural networks can handle complex problems in forecasting in difficult business situations.



Making the model available to others


Web-Based DSS

• Information fusion (multiple method forecasting).

• Use of models not owned by the developer.

• Sensitivity Analysis.
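Information fusion here means combining forecasts from several methods into one. The slides do not say which combination rule is used, so the sketch below shows one simple possibility, a plurality vote over predicted classes, purely as an assumption. The method names, the predictions, and the tie-breaking rule are all hypothetical.

```python
from collections import Counter


def fuse_by_vote(predictions: dict) -> int:
    """Combine class predictions (1..9) from several methods by plurality vote.

    Assumption: ties are broken by taking the lower-middle of the tied classes.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions.values())
    top = max(counts.values())
    tied = sorted(cls for cls, n in counts.items() if n == top)
    return tied[(len(tied) - 1) // 2]


if __name__ == "__main__":
    # Hypothetical predictions from four methods for a single movie.
    votes = {
        "neural_network": 7,
        "decision_tree": 7,
        "logistic_regression": 6,
        "discriminant_analysis": 8,
    }
    print("Fused class:", fuse_by_vote(votes))
```

A weighted average of the individual forecasts would be another reasonable rule; the vote is used here only because it keeps the example short.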



DSS: Movie Forecast Guru

• Forecast Methods:
– Neural Networks
– Decision Tree (CART & C5)
– Logistic Regression
– Discriminant Analysis
– Information Fusion

• .Net server


Demo



Conclusions
• Results continue to get better.

• Text mining can be expanded to other areas in addition to plot summaries.

• Marketing challenge remains!

• Many other similar problems in forecasting.

• Web-DSS framework is quite powerful.
