
Useful Development Packages for Information Retrieval (Continuously Updated)

February 20, 2018 ⁄ General

Word Vector Tool

http://sourceforge.jp/projects/sfnet_wvtool/

The Word Vector Tool is a simple but flexible Java library for creating word vector representations of text documents. Word vectors
can be used for various text processing tasks, such as text classification, text clustering, or information retrieval.
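
To make the idea of a word vector representation concrete, here is a minimal Python sketch that turns documents into TF-IDF weighted vectors and compares them with cosine similarity. It does not use the Word Vector Tool (a Java library) or reproduce its API. The function names `tokenize`, `tfidf_vectors`, and `cosine` are hypothetical, and the tokenizer and IDF formula are simplifying assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits (a naive tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with one TF-IDF weighted vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Assumed IDF variant: log(N / df) + 1, so terms found everywhere keep some weight.
    idf = {t: math.log(n_docs / df[t]) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append([tf[t] * idf[t] for t in vocab])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the dog sat on the log",
        "stock markets fell sharply",
    ]
    vocab, vecs = tfidf_vectors(docs)
    print(len(vocab), "terms in the vocabulary")
    print("doc0 vs doc1:", cosine(vecs[0], vecs[1]))
    print("doc0 vs doc2:", cosine(vecs[0], vecs[2]))
```

Vectors like these are the usual input for the classification, clustering, and retrieval tasks mentioned above: documents that share weighted terms score a higher cosine similarity.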


FudanNLP

http://code.google.com/p/fudannlp/

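Functions

  1. Information retrieval: text classification, news clustering
  2. Chinese processing: Chinese word segmentation, part-of-speech tagging, named entity recognition, keyword extraction, dependency parsing, time expression recognition
  3. Structured learning: online learning, hierarchical classification, clustering, exact inference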
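
As an illustration of what Chinese word segmentation involves, the following Python sketch implements forward maximum matching against a small dictionary. This is a classic baseline and not the method FudanNLP uses. The function name `forward_max_match` and the toy dictionary are assumptions made for the example.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Segment text greedily, taking the longest dictionary word at each position.

    Characters not covered by any dictionary word are emitted one at a time.
    """
    if max_len is None:
        max_len = max((len(w) for w in dictionary), default=1)
    max_len = max(1, max_len)  # guard against a zero or negative window
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


if __name__ == "__main__":
    # Toy dictionary, assumed for illustration only.
    toy_dictionary = {"信息", "检索", "信息检索", "中文", "分词", "工具"}
    print(forward_max_match("中文信息检索工具", toy_dictionary))
```

Greedy matching is fast but cannot resolve ambiguous splits, which is one reason toolkits typically rely on statistical sequence models instead.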


Using Stanford NLP for Chinese

http://www.zhizhihu.com/html/y2011/3060.html


Mallet: A Natural Language Processing Toolkit

http://www.zhizhihu.com/html/y2010/2199.html


http://blog.mashape.com/post/48946187179/20-natural-language-processing-apis

Natural Language Processing, or NLP, is a field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human (natural) languages.

Here are some useful APIs that help bridge human-computer interaction:

  1. Text Processing - The WebKnox text processing API lets you process (natural)
    language texts. You can detect the text’s language, the quality of the writing, find entity mentions, tag part-of-speech, extract dates, extract locations, or determine the sentiment of the text.
  2. Question-Answering - The WebKnox question-answering API allows you to find
    answers to natural language questions. These questions can be factual such as “What is the capital of Australia” or more complex.
  3. Jeannie - Jeannie (Voice Actions) is a virtual assistant with over two million downloads,
    now also available via API. The objective of this service is to provide you and your robot with the smartest answer to any natural language question, just like Siri.
  4. Diffbot - Diffbot extracts data from web pages automatically and returns structured
    JSON. For example, our Article API returns an article’s title, author, date and full-text. Use the web as your database! We use computer vision, machine learning and natural language processing to add structure to just about any web page.
  5. nlpTools - Text processing framework to analyse Natural Language. It is especially
    focused on text classification and sentiment analysis of online news media (general-purpose, multiple topics).
  6. Speech2Topics - Yactraq Speech2Topics is a cloud service that converts audiovisual
    content into topic metadata via speech recognition and natural language processing. Customers use Yactraq metadata to target ads, build UX features like content search/discovery, and mine YouTube videos for brand sentiment.
  7. Stremor Automated Summary and Abstract Generator - Language
    Heuristics goes a step beyond Natural Language Processing to extract intent from text. Summaries are created through extraction, but maintain readability by keeping sentence dependencies intact.
  8. Repustate Sentiment and Social Media Analytics - Repustate’s
    sentiment analysis and social media analytics API allows you to extract key words and phrases and determine social media sentiment in one of many languages, including English, Arabic, German, French and Spanish. You can also monitor social media through
    the API and retrieve your data with simple API calls.
  9. Sentiment Analysis for Social Media - The multilingual
    sentiment analysis API from Chatterbox (with a claimed accuracy of 83.4%, against a cited industry standard of 65.4%, and available in Mandarin) classifies social media texts as positive or negative, with a free daily allowance to get you started. The system
    uses advanced statistical models (machine learning & NLP) trained on social data, meaning the detection can handle slang, common misspellings, emoticons, hashtags, etc.
  10. Skyttle 2.0 - Skyttle API extracts topical keywords (single words and multiword
    expressions) and sentiment (positive or negative) expressed in text. Supported languages are English, French, German, and Russian.
  11. Text-Processing - Sentiment analysis, stemming and lemmatization, part-of-speech
    tagging and chunking, phrase extraction and named entity recognition.
  12. Stemmer - This API takes a paragraph and returns the text with each word stemmed using
    the Porter, Snowball, or UEA stemmer.
  13. SpringSense Meaning Recognition - The fastest and most accurate
    Meaning Recognition (Word Sense Disambiguation) API in the world. Recognises any nouns in a body of text and allows you to provide a rich user-interface with meaning definitions.
  14. LanguageTool - Style and grammar checking / proofreading for more than 25 languages,
    including English, French, Polish, Spanish and German.
  15. DuckDuckGo - DuckDuckGo Zero-click Info includes topic summaries,
    categories, disambiguation, official sites, !bang redirects, definitions and more. You can use this API for many things, e.g. to define people, places, things, words and concepts; to provide direct links to other services (via !bang syntax); to list related topics;
    and to give official sites when available.
  16. Jetlore Semantic Text Processing - Semantic Text Processing API extracts
    named entities from English text, including social media posts, user comments, product reviews, picture captions, email content, news articles, and web pages. We guarantee exceptional accuracy of over 90% precision at over 60% recall. The API handles slang,
    common misspellings, understands hashtags, and auto-fetches embedded URLs, making it ideal for processing any user-generated content and social media.
  17. ESA Semantic Relatedness - Calculates the semantic relatedness between pairs
    of text excerpts based on the likeness of their meaning or semantic content.
  18. AlchemyAPI - AlchemyAPI provides advanced cloud-based and on-premise text analysis
    infrastructure that eliminates the expense and difficulty of integrating natural language processing systems into your application, service, or data processing pipeline.
  19. Sentence Recognition - The Sentence Recognition API matches strings of text
    based on the meaning of the sentences. Its NLP engine uses a semantic network to understand the text presented.
  20. Machine Linking - We develop a multilingual SaaS platform performing semantic analysis
    of textual documents: by interfacing with our API, developers can connect unstructured documents, written in different languages, to resources in the Linked Open Data cloud such as DBpedia or Freebase.
  21. TextTeaser - TextTeaser is an automatic summarization API. It extracts the most important
    sentences of an article. The purpose of the API is to provide a preview of what the article is all about.
  22. Textalytics Media Analytics - Textalytics Media Analysis API analyzes
    mentions, topics, opinions and facts in all types of media. This API provides services for:
      - Sentiment analysis: extracts positive and negative opinions according to the context.
      - Entity extraction: identifies persons, companies, brands, products, etc. and provides a canonical form that unifies different mentions (IBM, International Business Machines Corporation, etc.).
      - Topic and keyword extraction.
      - Facts and other key information: dates, URLs, addresses, user names, e-mails and money amounts.
      - Thematic classification: organizes information by topic using the IPTC standard classification (more than 200 hierarchically structured categories).
      - Configurations for different types of media: microblogging and social networks, blogs and news.
  23. Wit.ai - Wit.ai enables developers to add a modern, Siri-like natural
    language interface to their app or device with minimal effort. It integrates well with Android’s speech-to-text engine.
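
Several entries in the list above (TextTeaser, and the Stremor summary generator) describe extractive summarization, in which the most important sentences of a text are selected rather than rewritten. The Python sketch below shows one simple way to do that, by scoring each sentence on the average frequency of its non-stopword terms. It is not the algorithm behind either service. The function names, the stopword list, and the scoring rule are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny stopword list, assumed for the example; real systems use far larger ones.
STOPWORDS = frozenset({
    "a", "an", "the", "and", "or", "of", "to", "in", "on", "by",
    "is", "are", "it", "for", "with", "my",
})


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?' (a rough heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def content_words(text):
    """Lowercased words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Return up to max_sentences of the highest-scoring sentences, in original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        words = content_words(sentence)
        # Average term frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]


if __name__ == "__main__":
    sample = ("Search engines index documents. "
              "Search engines rank documents by relevance. "
              "My cat likes tuna.")
    for sentence in summarize(sample, max_sentences=1):
        print(sentence)
```

Returning the selected sentences in their original order keeps the summary readable, which is the same concern the Stremor description raises about preserving sentence dependencies.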

You should also check out our other useful API lists for machine learning, summarizing text, sentiment analysis, SMS APIs, and face recognition APIs.
