Analytics Magazine

Data Mining: Predicting stock price movements

January/February 2011

INFORMS data mining contest attracts 894 participants representing 147 teams from 27 countries.

By Durai Sundaramoorthi, Philippe Belanger and Louis Duclos-Gosselin

Cole Harris of Exagen Diagnostics (www.exagen.com) won the 2010 Data Mining Contest, which required participants to develop a predictive analysis solution for stock price movement (increase or decrease) over “the next 60 minutes,” with data recorded at five-minute intervals. (For example, at 9:30 a.m., will XYZ’s stock price be higher or lower at 10:30 a.m., based on historical data?)
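As an illustration of the prediction target, the sketch below labels each five-minute observation with the direction of the price 60 minutes later. It is a minimal example, not the contest's actual target definition: the function name `label_movements`, the 12-step horizon (60 minutes divided by five-minute intervals) and the choice to treat an unchanged price as "no increase" are all assumptions made here.

```python
from typing import List, Optional

# Assumption: 60-minute horizon measured in five-minute steps.
STEPS_AHEAD = 12


def label_movements(
    prices: List[float], steps_ahead: int = STEPS_AHEAD
) -> List[Optional[int]]:
    """Label each observation by the price direction `steps_ahead` steps later.

    Returns 1 when the later price is higher, 0 when it is lower or unchanged
    (an assumption), and None when the later price is not yet known.
    """
    labels: List[Optional[int]] = []
    for i, price in enumerate(prices):
        j = i + steps_ahead
        if j >= len(prices):
            labels.append(None)  # no observation 60 minutes ahead
        else:
            labels.append(1 if prices[j] > price else 0)
    return labels
```

The last `steps_ahead` observations receive no label, because the price 60 minutes later has not yet been observed.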

Christopher Hefele of AT&T finished second and Nan Zhou from the University of Pittsburgh placed third in the contest organized and sponsored by the INFORMS Data Mining Section. The contest drew 894 participants from 147 teams representing 27 countries, making it the largest event of its kind in the world.

Day traders, mutual fund traders and hedge funds have always tried to predict the direction of stock prices in the next few hours. Predictive analysis solutions developed in the contest aim to pursue this objective. Hedge funds can use these types of solutions to build complex strategies to be executed automatically. Mutual fund traders can use them to achieve a “best execution” of fund managers’ buy/sell orders, while day traders can use them to realize fast profits over short periods of time.

In real-world applications, predictive analysis solutions produce business value when they are built into recommendation systems.

Here’s how it works:

First, the data miners develop the data mining techniques by:

  • making the different databases communicate with each other to collect all the information on the back office databases;
  • extracting the needed data and dealing with missing data and outliers;
  • preparing the data for the data mining algorithm;
  • writing the code that implements the data mining algorithm on the company systems;
  • scoring the test database to validate the performance of the data mining techniques; and
  • writing the code that implements the full process on the company systems.
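The data preparation step in the list above, dealing with missing data and outliers, might look like the following sketch. The article does not say which techniques were used; forward filling and clipping to fixed bounds are assumptions chosen here because they are simple and rely only on past observations. The function names and bounds are hypothetical.

```python
from typing import List, Optional


def forward_fill(values: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing value (None) with the most recent known value.

    Leading gaps stay None because no earlier value exists to carry forward.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def clip_outliers(values: List[float], lower: float, upper: float) -> List[float]:
    """Limit every value to the range [lower, upper] (bounds are assumed)."""
    return [min(max(value, lower), upper) for value in values]
```

In practice the bounds would be derived from the training data only, so that information from the test database does not leak into the preparation step.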

To verify that the data mining algorithm will work when they implement it, the “miners” predict (score) the test database observations and validate the performance. The data mining algorithm should have similar performance in the training database and in the test database to be considered a “superior performer.” The data mining algorithm with the best performance is implemented. The test database generally represents 30 percent of the database.
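The validation idea can be sketched as follows: hold out 30 percent of the observations, score them, and check that performance on the training and test portions is similar. This is a minimal sketch under stated assumptions. The chronological split (oldest observations for training), the use of accuracy as the metric and the 0.05 tolerance are choices made here for illustration, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    rows: Sequence, test_fraction: float = 0.30
) -> Tuple[List, List]:
    """Keep the earliest rows for training and the latest rows for testing."""
    n_test = int(round(len(rows) * test_fraction))
    cut = len(rows) - n_test
    return list(rows[:cut]), list(rows[cut:])


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Share of observations whose predicted direction matches the actual one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


def performs_similarly(
    train_score: float, test_score: float, tolerance: float = 0.05
) -> bool:
    """True when training and test scores differ by at most `tolerance`."""
    return abs(train_score - test_score) <= tolerance
```

A large gap between the two scores suggests the algorithm has memorized the training database rather than learned a pattern that carries over to new data.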

Once the implementation of the data mining techniques is complete, a single program runs every five minutes. It collects the needed information from the database, applies the data mining algorithm, predicts whether the studied stock price will increase or decrease in 60 minutes and recommends the appropriate action to maximize profit. Figure 1 illustrates this process.
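The five-minute cycle described above can be sketched as a small loop: collect, predict, recommend, wait. Everything named here is hypothetical. `fetch_features` stands in for the database query, `model` stands in for the trained data mining algorithm and is assumed to return the probability of an increase, and mapping a predicted increase to "BUY" and a predicted decrease to "SELL" is a simplification rather than the article's recommendation logic.

```python
import time
from typing import Callable, Dict, List

Features = Dict[str, float]


def recommend(prob_up: float, threshold: float = 0.5) -> str:
    """Turn a predicted probability of an increase into an action (assumed rule)."""
    return "BUY" if prob_up > threshold else "SELL"


def run_cycle(
    fetch_features: Callable[[], Features],
    model: Callable[[Features], float],
) -> str:
    """One pass: collect the inputs, apply the model, recommend an action."""
    features = fetch_features()
    prob_up = model(features)
    return recommend(prob_up)


def run_schedule(
    fetch_features: Callable[[], Features],
    model: Callable[[Features], float],
    cycles: int,
    interval_seconds: float = 300.0,  # five minutes
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run `cycles` passes, pausing `interval_seconds` between passes."""
    actions: List[str] = []
    for i in range(cycles):
        actions.append(run_cycle(fetch_features, model))
        if i < cycles - 1:
            sleep(interval_seconds)
    return actions
```

Passing `sleep` as a parameter lets the loop be exercised without waiting, for example `run_schedule(fetch, model, cycles=2, sleep=lambda seconds: None)`.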

In the INFORMS Data Mining Contest, participants were provided with a set of macro-economic and high-frequency financial data to build their predictive analysis solutions. The data were composed of stock prices, sector indexes, economic indicators and expert predictions on economic indicators. The database was separated into two data sets: the training set for building predictive analysis model(s) and the test set (from which the target variable had been excluded) for evaluating participants’ predictions. The participants built their predictive analysis solutions using the training set and applied them to the test set by predicting the target variable.

The top three finishers in the contest presented their methods at the 2010 INFORMS Annual Meeting in Austin, Texas, in November. To view slides of their solutions, see: www.kaggle.com/informs2010.


Durai Sundaramoorthi is an assistant professor at Missouri Western State University, Philippe Belanger of Laval University served as co-chair of the 2010 Data Mining Contest, and Louis Duclos-Gosselin (louis.gosselin@hotmail.com) of Sinapse chaired the contest, a role he will continue in 2011.
