Austin AI offers pre-packaged software solutions for specific use cases.

These frameworks are never applied blindly or hidden from clients. Instead, we integrate them into each client's own infrastructure with appropriate customization and attention to detail.

ScrapeAI

Create your own Alternative Data sets from the internet

Companies of all sizes are increasingly turning to Alternative Data sets (“Alt Data”) to augment their data analytics, AI, and ML efforts. Alt Data is information that has not previously been collected or parsed into a structured format and may not be available for sale. It can be aggregated from the internet or compiled by hand.

Other Alt Data providers either aggregate data in their own database and sell it back to you, or write simple scripts and hand over Excel files. We install our scraping platform on your own infrastructure with proper database design, QA the results, and provide API access to the data. We also customize the platform, provide a powerful visualization tool and/or custom GUI, and apply data cleaning, AI, or ML to enhance the results. Once in production, the platform runs in real time. The result is a superior, ready-to-use product that puts the user in total control with minimal effort.

01

Data sources we have worked with before include:

  • Mainstream and niche news sites
  • The U.S. Census
  • Many property- and real-estate-related data sources
  • Many finance-related data sources
  • Many industrial and industry-specific data sources
  • Retail websites / shopping
  • Social media
  • YouTube and other user generated content sites
  • Weather sites

02

Data enhancements we have performed before include:

  • Many cleaning steps such as imputation of null values, data validation, and cross-referencing
  • Many statistical transforms, such as rates of change and cross-sectional or time-series statistics
  • Custom sentiment models, customized per industry and use case
  • Term, word and/or emoji frequency (absolute or relative) 
  • Recognition and categorization of named entities, subjects, and custom phrases
  • Conversion of audio to text
  • Image processing and recognition 
  • Other custom AI and ML models
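
To make a few of the enhancements above concrete, here is a minimal Python sketch of three of them: imputation of null values, rates of change, and term/emoji frequency. It is an illustration only, not the platform's actual code. The function names (`impute_with_mean`, `rates_of_change`, `term_frequencies`) are hypothetical, mean imputation is just one of many possible strategies, and the tokenizer is deliberately simple: it treats every non-ASCII character as its own token, so multi-character emoji sequences and accented letters are split.

```python
import re
from collections import Counter
from statistics import mean


def impute_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to impute from
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def rates_of_change(values):
    """Period-over-period fractional change; None where the previous value is 0."""
    changes = []
    for prev, curr in zip(values, values[1:]):
        changes.append(None if prev == 0 else (curr - prev) / prev)
    return changes


def term_frequencies(text, relative=False):
    """Count lowercase ASCII word tokens plus each non-ASCII symbol (e.g. an emoji)."""
    tokens = re.findall(r"[a-z0-9']+|[^\x00-\x7f]", text.lower())
    counts = Counter(tokens)
    if not relative:
        return dict(counts)
    total = sum(counts.values())
    return {token: n / total for token, n in counts.items()}


# Hypothetical sample data, for illustration only.
prices = impute_with_mean([100.0, None, 110.0])
changes = rates_of_change(prices)
frequencies = term_frequencies("Great phone 🔥 great price 🔥")
```

In practice, the imputation strategy and tokenization rules would be chosen per data source and use case.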

03

Features of the platform:

  • Accessible and open SQL and Elasticsearch backend 
  • REST API provided for internal and/or external use
  • Installs on Windows or Linux on any cloud provider (AWS and Windows preferred)
  • Intelligent IP obfuscation under the hood 
  • Ability to ingest different data types: text, audio, video, PDF
  • Unit testing to ensure data processes work reliably
  • Integrates with Slack and email for notifications
  • Admin tool for scheduling and monitoring

Pricing

  • Initial Setup: $20K
  • Per Data Source: $10K
  • Per Post Process Module: $10K
  • Maintenance: $1K
*Prices may vary depending on the scope of each data source and post process module. Extra charges may apply for large volumes of scraping or audio transcription.
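
As a rough illustration of how the list prices above combine, the Python sketch below adds them up for a given number of data sources and post process modules. The function name `estimate_cost` and the `maintenance_periods` parameter are hypothetical. The billing period for maintenance is not stated here, so the sketch simply multiplies the maintenance price by however many periods you pass in. Actual quotes may differ with scope and volume, as noted above.

```python
# List prices in US dollars, taken from the pricing above.
INITIAL_SETUP = 20_000
PER_DATA_SOURCE = 10_000
PER_POST_PROCESS_MODULE = 10_000
MAINTENANCE = 1_000  # assumption: charged once per billing period of unstated length


def estimate_cost(data_sources, post_process_modules, maintenance_periods=0):
    """Return a list-price estimate in US dollars, before any scope adjustments."""
    if min(data_sources, post_process_modules, maintenance_periods) < 0:
        raise ValueError("counts must be non-negative")
    return (
        INITIAL_SETUP
        + PER_DATA_SOURCE * data_sources
        + PER_POST_PROCESS_MODULE * post_process_modules
        + MAINTENANCE * maintenance_periods
    )


# Hypothetical example: 3 data sources, 2 post process modules, 12 maintenance periods.
estimate = estimate_cost(3, 2, maintenance_periods=12)
```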

Ready to get started?

Contact us for a no-cost assessment, which includes a consultative discussion of your business needs, an evaluation of data readiness, and initial modeling.