Is Tesla overvalued? How to do stock valuation with machine learning.

Model basic anatomy

  • Dataset: The latest balance sheet, income statement, and cash flow statement for each stock in the S&P 1500. The S&P 1500 is the combination of the S&P 500 (large-cap stocks), S&P 400 (mid-cap stocks), and S&P 600 (small-cap stocks). Because of missing data, I had to discard 93 stocks, leaving 1407.
  • Input variables (Xs): A total of 16 different financial figures for each company, pulled from the latest quarterly filing (10-Q). Think income, revenue, assets, dividends paid, revenue growth, etc.
  • Output variable (Y): Market cap. This is the stock price multiplied by the number of shares outstanding. You can think of the market cap as the price to buy the entire company. Tesla’s market cap is currently about $700 billion.
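As a minimal sketch of the output variable, market cap is just price times share count, and a market cap estimate can be turned back into a per-share price by dividing by the same share count. The function names and the numbers below are hypothetical and are not taken from the dataset.

```python
def market_cap(share_price, shares_outstanding):
    """Market cap: the price to buy every outstanding share."""
    return share_price * shares_outstanding


def implied_share_price(market_cap_value, shares_outstanding):
    """Convert a market cap (actual or estimated) back to a per-share price."""
    return market_cap_value / shares_outstanding


# Hypothetical company: $50.00 per share, 2 million shares outstanding.
cap = market_cap(50.0, 2_000_000)
price = implied_share_price(cap, 2_000_000)
```

Working at the market cap level rather than the share price level keeps companies comparable, since the share count is an arbitrary choice each company makes.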

Conventional stock valuation

How do you identify an undervalued stock?

Price/Earnings (P/E)

Earnings vs. Market Cap. TSLA is red.

Price/Sales (P/S)

Sales vs. Market Cap. TSLA is red.

Price/Book (P/B)

Book Value vs. Market Cap. TSLA is red.

Dividend yield

Dividend Yield vs. Market Cap. TSLA is red.
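The four ratios above can be computed at the company level from the same kind of figures the charts plot against market cap. This is a rough sketch with hypothetical inputs; the function name is made up for illustration, and returning None for non-positive denominators is an assumption about how to handle companies with, say, negative earnings.

```python
def valuation_ratios(market_cap, earnings, sales, book_value, dividends_paid):
    """Company-level versions of the four conventional valuation ratios."""

    def ratio(numerator, denominator):
        # A ratio against a zero or negative figure is not meaningful.
        return numerator / denominator if denominator > 0 else None

    return {
        "P/E": ratio(market_cap, earnings),
        "P/S": ratio(market_cap, sales),
        "P/B": ratio(market_cap, book_value),
        "dividend_yield": ratio(dividends_paid, market_cap),
    }


# Hypothetical company, all figures in millions of dollars.
ratios = valuation_ratios(
    market_cap=1000.0,
    earnings=50.0,
    sales=400.0,
    book_value=250.0,
    dividends_paid=20.0,
)
# ratios["P/E"] is 20.0, ratios["P/S"] is 2.5, ratios["P/B"] is 4.0
```

Each ratio compares market cap to exactly one figure at a time, so two companies with the same earnings but very different growth or assets get the same P/E treatment.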

The problem with valuation ratios

TSLA figures from Morningstar. S&P 500 figures from multpl.

Model Design


Algorithm selection

  • Linear regression
  • Polynomial regression
  • Linear SVMs
  • Decision trees
  • Neural networks

Model inputs

  1. Lack of consistency: Apart from a handful of top-line items like revenue, earnings, and assets, companies are surprisingly different in what they choose to report and how they choose to calculate their figures.
  2. Overfitting: Overfitting is more likely when there is a limited dataset. My dataset only contained 1407 individual data points.


  • Dataset: 1407 stocks
  • Algorithm: XGBoost in a 100-model ensemble
  • Runs: 100 random test/train splits
  • Average test r²: 0.95
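To make "average test r² over 100 random test/train splits" concrete, here is a simplified sketch of that evaluation loop. It is not the actual model: a one-variable least-squares line stands in for the XGBoost ensemble, the data is synthetic, and the 20% test fraction and all function names are assumptions made for illustration.

```python
import random
from statistics import mean


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def average_test_r2(xs, ys, runs=100, test_fraction=0.2, seed=0):
    """Refit on a fresh random split each run; score only on held-out rows."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_test = max(2, int(len(xs) * test_fraction))
    scores = []
    for _ in range(runs):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        predicted = [slope * xs[i] + intercept for i in test]
        scores.append(r_squared([ys[i] for i in test], predicted))
    return mean(scores), scores


# Synthetic stand-in data: one input figure and a noisy linear target.
data_rng = random.Random(42)
xs = [data_rng.uniform(0.0, 10.0) for _ in range(200)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 1.0) for x in xs]

avg_r2, all_scores = average_test_r2(xs, ys)
```

Averaging over many random splits matters with a dataset this small, because a single lucky or unlucky split can move the test r² noticeably.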

Problems with the model / next steps

  1. More data != better results: While we’re using the S&P 1500 stocks for the dataset, we would like to use the 6000+ stocks from the US market (NYSE + Nasdaq). However, attempting to use this larger dataset throws off the model. Our hypothesis is that S&P selects its index constituents partially for the rationality of their market caps. A good stock valuation model, though, should be able to predict any market cap.
  2. Model jitter: Because financial fundamentals only change once per quarter, price estimates shouldn’t swing much day to day. However, because the market caps used as the target change daily, the model itself also tends to swing. We think this is partially explained by XGBoost, which builds ensembles of gradient-boosted decision trees. Decision tree algorithms have a reputation for being more “touchy” like this. The problem might be remedied with a different algorithm or by averaging market caps over several days or weeks.
  3. Better explainability: While the model has a high r², the input dimensions are somewhat discordant. Some are absolute values, others are ratios, and others seem like repeats. Missing, but not for lack of effort, are any input dimensions that capture the “acceleration” of a company, i.e. the change of the change.
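Two of the ideas in this list are easy to sketch: averaging market caps over several days to dampen jitter, and measuring "acceleration" as the change of the change. The code below is one possible way to do each, not something the current model does; the function names, the window size, and the sample numbers are hypothetical.

```python
def trailing_average(values, window):
    """Average each value with up to `window - 1` values before it."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def second_difference(values):
    """Change of the change between consecutive periods."""
    return [values[i] - 2 * values[i - 1] + values[i - 2]
            for i in range(2, len(values))]


# Hypothetical daily market caps, smoothed over a 2-day window.
smoothed_caps = trailing_average([10, 20, 30, 40], window=2)
# smoothed_caps is [10.0, 15.0, 25.0, 35.0]

# Hypothetical quarterly revenue: growth of 10, 15, then 20 per quarter.
acceleration = second_difference([100, 110, 125, 145])
# acceleration is [5, 5]
```

A trailing window only looks backward, so the smoothed target never uses market caps from after the date being modeled.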

So should I buy Tesla stock?




Author of mongita & code2flow. Working on FFER & fastmap.

Scott Rogowski
