Your product plans involve using data available on the web, but you can’t build the product until you first acquire and structure that data.
You need to iterate quickly. However, you can’t afford to rewrite your data acquisition, processing, and indexing infrastructure from scratch every other week. By leveraging our expertise and the flexible but well-grounded Iūdex project, we can provide you with an initial data set quickly, while setting a course for continuous improvements in coverage, update latency, and overall data quality.
With the data in hand, we enable you to focus on your product vision.
While we use and contribute to many open-source projects, we will invariably develop custom software to solve your particular needs. We are broadly capable engineers, but we typically build custom software starting from our core competencies, which include the design and implementation of systems for:
Acquisition of content via web crawling, feeds, APIs, and social streams.
Content preparation (articles, images), information extraction, and text processing.
Structured relational (e.g. PostgreSQL), document (e.g. MongoDB), or full-text (e.g. Lucene, Solr) databases, continuous indexing, and associated services.
Contact me at dek@<here> and we’ll set up an initial meeting to discuss your needs and how we might help.