Wake-Up Call: How Does Big Data Make and Lose Money?
Translator: scv123. Original author: Marco Visibelli, Kuldat. Reposted from Yeeyan (译言网).
Big Data is enjoying a big moment in the sun. But who stands to benefit most from this technology, and how?
After working to implement early Big Data projects in industries like telecommunications and investment banking over the last decade, I have concluded this emerging technology can best be harnessed to gain a more precise understanding of complex systems like stock markets and supply chains. (It’s not surprising that investment banks, in particular, have been amongst the first to adopt Big Data analytics. After all, executives whose business is making money are usually keenest to use technology to save and create wealth.)
In investment banking, the number of documents (news, balance sheets, etc.) needed to accurately recommend investments or stock purchases is too great to process manually. So associates tend to simplify their assumptions and use spreadsheet files for most of their work. But the availability of Big Data technology to process vast quantities of information can reduce the risk these shortcuts introduce and empower companies to make better analyses and predictions than ever before.
1. How Companies Make Money With Big Data
With a Big Data platform, stock market traders and investment portfolio managers can process vast amounts of unstructured data to identify the best companies in which to invest.
Unstructured public information like company news, product reviews, supplier data and price-list changes can be processed en masse as Big Data, producing mathematical models that help traders decide which stocks to buy or sell.
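As a rough illustration of how unstructured text can become a number a model could use, here is a toy sketch in Python. It is not the method any firm described here uses: the keyword lists, company names and headlines are all made up, and a production system would learn its weights from far larger volumes of data.

```python
from collections import defaultdict

# Hypothetical keyword lists; a real system would learn weights from data.
POSITIVE = {"beats", "record", "upgrade", "growth"}
NEGATIVE = {"recall", "lawsuit", "downgrade", "shortfall"}


def headline_score(headline: str) -> int:
    """Count positive keywords minus negative keywords in one headline."""
    words = headline.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def company_signals(items):
    """Sum headline scores per company from (company, headline) pairs."""
    totals = defaultdict(int)
    for company, headline in items:
        totals[company] += headline_score(headline)
    return dict(totals)


# Made-up sample data, for illustration only.
sample = [
    ("ACME", "ACME beats forecasts on record growth"),
    ("ACME", "ACME faces lawsuit over supplier contract"),
    ("Globex", "Globex announces product recall"),
]
signals = company_signals(sample)
print(signals)
```

A per-company score like this would be only one input among many; the point is simply that text has to be reduced to numbers before a model can weigh it.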
Businesses that use Big Data for investment forecasting in this way often mitigate the upfront costs of their projects by using cloud services like Amazon Web Services, starting with a small group of servers and scaling up when they become profitable. I know of one quantitative analyst who, after quitting his job at a major investment bank, was able to create a profitable Big Data trading system in less than six months with a very modest investment.
Even in the manufacturing sector, forecasting can be upgraded by using Big Data. A major European car manufacturer I consulted for created an internal system to gain actionable analytics on the cost of steel, helping it identify the optimal time to purchase raw materials at a better price. Built on Hadoop, the open-source Java framework, the system was able to combine several supplier databases with a total of 15 TB of information, saving the company $16 million in two years.
That project was a success for two reasons: the company had enough information to model all the suppliers and the program saved more money than the system cost to implement.
2. How Companies Lose Money With Big Data
But not every Big Data project succeeds in this way. Companies lose money on Big Data projects about as often as they make it. The early symptoms of a failure in the making vary, but the most common problems are:
Starting too big: Big Data doesn’t need a big budget. If you embark on a project in the belief that a big investment will guarantee a big return, something is wrong. Before starting, it is wise to test whether a limited spend on this technology really delivers the desired benefits on a small scale. If it does, the project can then be scaled up step by step so that economies of scale add up to bigger gains.
Underestimating human labor requirements: Before starting to implement a system, ask yourself a simple question: can your Big Data project work without constant human support? If the answer is “no”, then stop. You stand to lose millions trying to build a system that is impossible to maintain in a profitable way.
Trying to push the limits of natural language processing: One of Big Data’s oft-hailed promises is turning copious fields of data into readable narrative using natural language processing (NLP). The idea is exciting, but the reality, for companies trying to do this today, is often underwhelming. Natural language processing today has severe limitations because artificial intelligence is not yet advanced enough, and may not be for another 10 years.
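The first two warnings above can be turned into a simple check before each expansion: count the human labor along with the infrastructure, and fund the next stage only while the measured benefit still exceeds the total cost. The sketch below is one way to express that in Python. The `Stage` class, the stage names and every figure are hypothetical, chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    infrastructure_cost: float  # servers, cloud services, licences
    labor_cost: float           # ongoing human support
    measured_benefit: float     # savings or revenue attributed to the stage

    @property
    def net_benefit(self) -> float:
        """Benefit minus the combined infrastructure and labor cost."""
        return self.measured_benefit - (self.infrastructure_cost + self.labor_cost)


def stages_worth_funding(stages):
    """Return the names of stages to fund, stopping at the first one that loses money."""
    funded = []
    for stage in stages:
        if stage.net_benefit <= 0:
            break
        funded.append(stage.name)
    return funded


# Hypothetical figures, for illustration only.
plan = [
    Stage("pilot", 20_000, 30_000, 80_000),
    Stage("department", 100_000, 150_000, 400_000),
    Stage("company-wide", 500_000, 900_000, 1_200_000),
]
funded = stages_worth_funding(plan)
print(funded)
```

In this made-up plan the pilot and the department rollout pay for themselves, while the company-wide stage does not once its labor cost is counted, so the expansion would stop there.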
Modern Big Data has the potential to bring cost savings that would make data handlers of yesteryear marvel as though it were magic. But don’t commit your time and resources without first establishing whether your project will really be profitable. Only fools rush in.
Marco Visibelli is a data scientist who worked for IBM before founding Kuldat, a big data application companies use to gain useful sales and marketing insights, analyze their feasibility, and present possible outcomes.