Being approximately right at the right moment is better than being accurate after the moment has passed.
This is a fun weekend project for anyone who wants to try their hand at data science. A problem I’ve been grappling with is what, how, and how much to write on the website of a SaaS product. For a SaaS company, the website is pretty much its sales copy. So it is both interesting and relevant to know how the market, and especially the competition, presents its sales copy.
Let’s start with “how much” to write. One way to quantify this is the number of words and images on each page. Are fewer words ideal, or should the sales copy be longer? What is the current industry standard? You could A/B test different copy lengths and see which converts better. But getting a data-centric view of the current landscape before going down that path might be prudent.
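To make "number of words and images" concrete, here is a minimal sketch that counts both for a single page using only the Python standard library. The names `CopyCounter` and `count_copy` are my own placeholders, and skipping the text inside `script`, `style`, and `noscript` tags is an assumption about what counts as copy, not a rule.

```python
from html.parser import HTMLParser


class CopyCounter(HTMLParser):
    """Counts visible words and <img> tags in one HTML page."""

    # Assumption: text inside these tags is not sales copy.
    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.words = 0
        self.images = 0
        self._skip_depth = 0  # > 0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images += 1
        elif tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Whitespace-separated tokens are a rough stand-in for words.
        if self._skip_depth == 0:
            self.words += len(data.split())


def count_copy(html):
    """Return the word and image counts for one page of HTML."""
    counter = CopyCounter()
    counter.feed(html)
    counter.close()  # flush any text still buffered by the parser
    return {"words": counter.words, "images": counter.images}
```

Calling `count_copy(page_html)` on the HTML of a downloaded page gives a small dictionary you can store per site and per page. Splitting on whitespace is crude, but for comparing copy length across sites it is probably good enough.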
Here’s what I have in mind.
Scrape several popular SaaS companies’ websites. Most of them have similar templates. *A home page. Product page. About us. Pricing.* With a bit of cleaning, it should be easy to get some aggregate metrics and publish the results in a dashboard.
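As a rough idea of what "aggregate metrics" could mean here, the sketch below groups per-page counts by page type and reports the median for each. The record layout (`page_type`, `words`, `images`) and the function name `summarize_by_page_type` are assumptions I made for illustration, and the median is just one reasonable choice of summary.

```python
from collections import defaultdict
from statistics import median


def summarize_by_page_type(pages):
    """Return the median word and image count for each page type.

    `pages` is a list of dicts with the hypothetical keys
    "page_type", "words", and "images".
    """
    grouped = defaultdict(list)
    for page in pages:
        grouped[page["page_type"]].append(page)

    summary = {}
    for page_type, rows in grouped.items():
        summary[page_type] = {
            "sites": len(rows),
            "median_words": median(row["words"] for row in rows),
            "median_images": median(row["images"] for row in rows),
        }
    return summary
```

I would lean on the median rather than the mean because one site with an unusually long home page would otherwise skew the whole picture. The resulting dictionary can be fed to whatever charting library you use for the dashboard.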
My technical stack would look something like this:
- jupyter notebook as the coding environment and python as the coding language.
- Google Colab for development.
- Scrape websites using
- hugging face to do text processing
- plotly to build the dashboard
- streamlit to host the application
- Deploy this on
- Code repo: (bitbucket if you are already on it)
- If you want to go all fancy, you could use fastAPI to build and deploy the app. But that will require coding the front-end (so, you will need to know CSS; I am not going to recommend it for a weekend hack).
All of the above services are available for free. The only word of caution I have is to respect the robots.txt file on each of the sites.
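One way to honor robots.txt is to check every URL against it before fetching. Here is a minimal sketch using `urllib.robotparser` from the standard library. The helper `build_robots_checker` and the user agent string `weekend-copy-bot` are hypothetical names I picked for this example.

```python
from urllib.robotparser import RobotFileParser


def build_robots_checker(robots_txt, user_agent="weekend-copy-bot"):
    """Return a function that says whether a URL may be fetched.

    `robots_txt` is the text of a site's robots.txt file.
    The default user agent is a made-up placeholder.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    def is_allowed(url):
        return parser.can_fetch(user_agent, url)

    return is_allowed
```

Here the rules are passed in as text so the check can be tried without touching the network. Against a live site you would instead point the parser at the file with `set_url(...)` and call `read()`, then skip any page for which `can_fetch` returns `False`.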
I am not sure if there is a micro-SaaS that can be built out of this. Maybe there is, but I am skeptical. One could get all fancy and use more detailed NLP models from hugging face to extract a variety of attributes to build insights on. But that’s going to take more than a weekend to do.
I am happy to help and mentor if someone is interested in building this as a weekend project.
I typically post about startups and Machine Learning. Please follow me on Twitter for more articles.