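A few days ago I received an email from Baidu announcing that the basic language-processing APIs of Baidu AI are now permanently free.
After reading the page carefully, I suddenly felt that Baidu has finally figured out the right thing to do.
Baidu AI developer site: http://ai.baidu.com/tech/nlp?hmsr=developeredm&hmpl=NLP_EDM
The interfaces that caught my attention:
1. Extract keywords from an article's title and content
2. Classify an article into a category from its title and content
3. Measure the similarity of two short texts (usable for smart recommendations)
Here is a brief walkthrough of how to use them:
1. Get an accessToken
The official documentation has a sample for this; I tweaked it slightly.
Prerequisite 1: JSON serialization and deserialization use the NuGet package LitJson
Prerequisite 2: the project needs a reference to System.Net.Http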
using LitJson;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Web;

namespace com.baidu.ai
{
    public static class BaiduAIAccess
    {
        // API Key of the application created in Baidu Cloud.
        // When creating the application, it is worth enabling several services at once.
        private static String clientId = "2nXP88tZlViNQGiZddHPUC7W";
        // Secret Key of the same application.
        private static String clientSecret = "3lM79E9uiuYiuhrTBpNWlTW92k1yX8PX";

        public static BaiduAIAccessTokenResult GetAccessToken()
        {
            String authHost = "https://aip.baidubce.com/oauth/2.0/token";
            HttpClient client = new HttpClient();
            // Form fields of the client_credentials grant.
            List<KeyValuePair<string, string>> paraList = new List<KeyValuePair<string, string>>();
            paraList.Add(new KeyValuePair<string, string>("grant_type", "client_credentials"));
            paraList.Add(new KeyValuePair<string, string>("client_id", clientId));
            paraList.Add(new KeyValuePair<string, string>("client_secret", clientSecret));
            HttpResponseMessage response = client.PostAsync(authHost, new FormUrlEncodedContent(paraList)).Result;
            String result = response.Content.ReadAsStringAsync().Result;
            // Map the JSON response onto the result class below.
            BaiduAIAccessTokenResult accessTokenResult = JsonMapper.ToObject<BaiduAIAccessTokenResult>(result);
            return accessTokenResult;
        }
    }

    public class BaiduAIAccessTokenResult
    {
        public string access_token { get; set; }
        public string session_key { get; set; }
        public string scope { get; set; }
        public string refresh_token { get; set; }
        public string session_secret { get; set; }
        public int expires_in { get; set; }
    }
}
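Since the token response carries an expires_in field, it seems wasteful to request a new token before every API call. Below is a minimal sketch of a cache that reuses the token until shortly before it expires. TokenCache, its constructor parameters, and the five-minute safety margin are my own hypothetical choices for illustration; they are not part of the Baidu sample or of any Baidu SDK.

```csharp
using System;

// Hypothetical helper (not from the Baidu docs): caches a token until it is about to expire.
public sealed class TokenCache
{
    private readonly Func<Tuple<string, int>> fetch; // returns (access_token, expires_in in seconds)
    private readonly Func<DateTime> now;             // injectable clock, handy for testing
    private readonly TimeSpan safetyMargin;          // refresh this long before the real expiry
    private readonly object gate = new object();
    private string token;
    private DateTime expiresAt = DateTime.MinValue;

    public TokenCache(Func<Tuple<string, int>> fetch, Func<DateTime> now = null, TimeSpan? safetyMargin = null)
    {
        if (fetch == null) throw new ArgumentNullException(nameof(fetch));
        this.fetch = fetch;
        this.now = now ?? (() => DateTime.UtcNow);
        this.safetyMargin = safetyMargin ?? TimeSpan.FromMinutes(5); // assumed margin
    }

    public string Get()
    {
        lock (gate)
        {
            DateTime current = now();
            if (token == null || current >= expiresAt)
            {
                // No token yet, or the cached one is (nearly) expired: fetch a fresh one.
                Tuple<string, int> fresh = fetch();
                token = fresh.Item1;
                expiresAt = current.AddSeconds(fresh.Item2) - safetyMargin;
            }
            return token;
        }
    }
}
```

With the classes above it could be wired up roughly like this: new TokenCache(() => { var r = BaiduAIAccess.GetAccessToken(); return Tuple.Create(r.access_token, r.expires_in); }), and then cache.Get() wherever a token is needed.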
2. Call the API. Only the keyword-extraction call is shown here.
using cloud0.Models;
using LitJson;
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Text;

namespace com.baidu.ai
{
    public class BaiduAIUtilization
    {
        // Endpoint URLs: keyword extraction and topic classification
        public static string url_keyword = "https://aip.baidubce.com/rpc/2.0/nlp/v1/keyword";
        public static string url_topic = "https://aip.baidubce.com/rpc/2.0/nlp/v1/topic";

        // Get the keywords for an article
        public static KeywordsResult GetBlogTags(string title_para, string content_para, string token)
        {
            string hostUrl = url_keyword + "?charset=UTF-8&access_token=" + token;
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(hostUrl);
            request.Method = "POST";
            request.ContentType = "application/json;charset=UTF-8";
            // Write the JSON request body: { "title": ..., "content": ... }
            using (var streamWriter = new StreamWriter(request.GetRequestStream()))
            {
                var jsonObject = new
                {
                    title = title_para,
                    content = content_para
                };
                string json = JsonMapper.ToJson(jsonObject);
                streamWriter.Write(json);
            }
            // Read the whole response as a UTF-8 string
            string jsonString;
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                Stream myResponseStream = response.GetResponseStream();
                using (StreamReader myStreamReader = new StreamReader(myResponseStream, Encoding.UTF8))
                {
                    jsonString = myStreamReader.ReadToEnd();
                }
            }
            return JsonMapper.ToObject<KeywordsResult>(jsonString);
        }
    }
}
Here KeywordsResult is a class I wrote myself to match the JSON object that the API returns:
// keyword
public class KeywordsResult
{
    public long log_id { get; set; }
    public Item[] items { get; set; }

    public class Item
    {
        public double score { get; set; }
        public string tag { get; set; }
    }
}
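Each returned item pairs a tag with a score, so a natural next step is to keep only the strongest tags. The sketch below assumes that a higher score means a more relevant keyword, which I have not verified against the official documentation. KeywordItem simply mirrors the Item class above so that the example is self-contained; KeywordFilter, TopTags, minScore and maxCount are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Mirrors KeywordsResult.Item so this sketch compiles on its own.
public class KeywordItem
{
    public double score { get; set; }
    public string tag { get; set; }
}

// Hypothetical helper: picks the highest-scoring tags.
public static class KeywordFilter
{
    public static List<string> TopTags(IEnumerable<KeywordItem> items, double minScore, int maxCount)
    {
        if (items == null) return new List<string>();
        return items
            .Where(i => i != null && !string.IsNullOrWhiteSpace(i.tag) && i.score >= minScore) // drop weak or empty tags
            .OrderByDescending(i => i.score)                                                  // strongest first
            .Take(maxCount)
            .Select(i => i.tag)
            .ToList();
    }
}
```

The threshold and the count are tuning knobs; sensible values would have to come from looking at real responses for your own articles.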
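To use it, just call the static GetBlogTags() method; its parameters are the title, the content, and the token. Before using it, register on the Baidu AI developer site to get the API Key and Secret Key needed to obtain the token. According to Baidu's announcement, this is free now and will stay free.
I ran a first test by typing in a title and content by hand, and the results were quite satisfying. Kudos to Baidu for offering a free resource like this!
I have put up two small endpoints that you can try right away:
http://www.songshizhao.com/blog/blogs.asmx?op=GetBlogKeywords
http://www.songshizhao.com/blog/blogs.asmx?op=GetBlogTopics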
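In my testing, when the uploaded text contains \t (tab) or \r (carriage return) characters, the returned results are poor. So besides reducing content to plain text, those characters should be stripped as well,
for example with Replace("\r", "").Replace("\t", ""): puretext = HttpUtility.UrlDecode(puretext).Replace("\r", "").Replace("\t", "");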
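The cleanup can be wrapped in a small helper so that every call goes through the same step. This is only a sketch: TextCleaner and ToPlainContent are hypothetical names, and unlike the one-liner above it replaces each run of \t and \r with a single space instead of deleting it, which is my own choice so that neighboring words in non-Chinese text do not get fused together.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: normalizes text before it is sent as the content parameter.
public static class TextCleaner
{
    public static string ToPlainContent(string text)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;
        // Replace each run of tab / carriage-return characters with one space,
        // then trim leading and trailing whitespace.
        return Regex.Replace(text, @"[\t\r]+", " ").Trim();
    }
}
```

A call would then look like GetBlogTags(title, TextCleaner.ToPlainContent(puretext), token).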
Calling Baidu AI via ajax should be just as convenient; I will try that another time.