Crowdsourcing has brought us Wikipedia and ways to understand how HIV proteins fold. It also provides an increasingly effective means for teams to write software, perform research or accomplish small repetitive digital tasks. However, most tasks have proven resistant to distributed labor without a central organizer. As in the case of Wikipedia, their success often relies on the efforts of a small cadre of dedicated volunteers. If these individuals move on, the project becomes difficult to sustain. Scientists funded by the National Science Foundation (NSF) are finding new solutions to these challenges.
Aniket Kittur, an associate professor in the Human-Computer Interaction Institute at Carnegie Mellon University (CMU), designs crowdsourcing frameworks combining machine learning and human intelligence to allow distributed workers to perform complicated cognitive tasks. These include writing how-to guides or organizing information without a central organizer.
At the Computer-Human Interaction conference in Chicago this week, Kittur and his collaborators Nathan Hahn and Joseph Chang (CMU), and Ji Eun Kim (Bosch Corporate Research), will present two prototype systems that enable teams of volunteers, bolstered by machine learning algorithms, to crowdsource more complex intellectual tasks with greater speed and accuracy (and at a lower cost) than past systems. The Knowledge Accelerator uses a machine-learning program to sort and organize information. “We are trying to scale up human thinking by letting people build on the work that others have done before them,” Kittur said.
The Knowledge Accelerator
One prototype that Kittur and his collaborators developed, the Knowledge Accelerator, empowers distributed workers to perform information synthesis. The software combines materials from a variety of sources and constructs articles that answer commonly sought questions such as “How do I get my tomato plant to produce more tomatoes?” or “How do I unclog my bathtub drain?”
To assemble answers, individuals identify high-value sources from the Internet, extract useful information, cluster clips into commonly discussed topics, and identify illustrative images or videos. With the Knowledge Accelerator, each crowd worker contributes a small amount of effort to synthesize online information to answer complex or open-ended questions without an overseer or moderator.
The researchers’ challenge lies in designing a system that can divide assignments into short microtasks, each paying crowd workers $1 for 5-10 minutes of work. The system then must combine that information to maintain the article flow and cohesion, as if a single author wrote it. The researchers showed that their method produced articles judged by crowd workers as more useful than pages in the top five Google results from a given query. Experts or professional writers typically create those top Google results. “Overall, we believe this is a step towards a future of big thinking in small pieces, where complex thinking can be scaled beyond individual limits by massively distributing it across individuals,” the authors concluded.
Kittur and his team also tackled the related problem of clustering: pulling out the patterns or themes among documents to organize information, whether Internet searches, academic research articles, or consumer product reviews. Machine learning systems have proven successful at automating aspects of this work, but their inability to understand distinctions in meaning among similar documents and topics means that humans are still better at the task. However, when human judgment is used in crowdsourcing, individuals often miss the full context that would allow them to do the job effectively.
The new Alloy system combines human intelligence and machine learning to speed up clustering using a two-step process. In the first step, crowd workers identify meaningful categories and provide representative examples, which the machine uses to cluster many topics or documents. However, not every document can be easily classified, so in the second step, humans consider those documents that the machines couldn’t cluster well, providing additional information and insights.
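The two-step idea can be illustrated with a small Python sketch. This is not Alloy's actual implementation, which the article does not describe in detail. The function names (`machine_cluster`, `human_review`), the word-overlap similarity measure, the 0.3 confidence threshold and the sample data are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Turn a text into word counts (a deliberately simple stand-in feature)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def machine_cluster(documents, seed_examples, threshold=0.3):
    """Step 1: place each document in the closest crowd-defined category.

    seed_examples maps a category label to the example texts supplied by
    crowd workers. Documents scoring below the (assumed) threshold are
    returned separately so that people can look at them.
    """
    seeds = {label: bag_of_words(" ".join(examples))
             for label, examples in seed_examples.items()}
    assigned, uncertain = {}, []
    for doc in documents:
        vec = bag_of_words(doc)
        label, score = max(((l, cosine(vec, s)) for l, s in seeds.items()),
                           key=lambda pair: pair[1])
        if score >= threshold:
            assigned[doc] = label
        else:
            uncertain.append(doc)
    return assigned, uncertain


def human_review(uncertain, ask_worker):
    """Step 2: send poorly clustered documents back to a crowd worker."""
    return {doc: ask_worker(doc) for doc in uncertain}


# Hypothetical data: two categories, each seeded with one worker example.
seed_examples = {
    "watering": ["water tomato plants deeply every week"],
    "pruning": ["prune suckers from tomato stems"],
}
documents = [
    "water the plants deeply",
    "prune the suckers early",
    "my cat likes sunshine",
]

assigned, uncertain = machine_cluster(documents, seed_examples)
# ask_worker is a stub here; in practice it would be a crowd microtask.
reviewed = human_review(uncertain, ask_worker=lambda doc: "other")
print(assigned)
print(reviewed)
```

In this toy run the first two documents share enough words with a seeded category to be placed automatically, while the third matches nothing and is handed to the stand-in worker.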
The study found that Alloy, using the two-step process, achieved better performance at a lower cost than previous crowd-based approaches. The framework, researchers say, could be adapted for other tasks, such as image clustering or real-time video event detection. “The key challenge here is trying to build a big picture view when each person can only see a small piece of the whole,” Kittur said. “We tackle this by giving workers new ways to see more context and by stitching together each worker’s view with a flexible machine-learning backbone.”
On the path to knowledge
Kittur is conducting his research under an NSF Faculty Early Career Development (CAREER) award he received in 2012. The award supports junior faculty who exemplify teacher-scholar roles through outstanding research, excellent education, and the integration of teaching and research within their organization’s mission. NSF is funding his work with $500,000 over five years. He says the work advances the understanding and design of crowdsourcing frameworks, which can be applied to various domains. “It has the potential to improve the efficiency of knowledge work, the training and practice of scientists, and the effectiveness of education,” Kittur says. “Our long-term goal is to produce a universal knowledge accelerator: capturing a fraction of the learning that every person engages in every day and making that benefit later people who can learn faster and more deeply than ever before.”