Main-Text Extraction Algorithm

A small, lightweight tool for extracting the main text from web pages

Project Background

		
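		from vampire.htmlextract import HtmlExtract
		import requests

		# Fetch the page and force UTF-8 decoding of the response body
		response = requests.get('http://www.fabao365.com/fangchan/167193/')
		response.encoding = "utf-8"

		# Extract the main text from the raw HTML and print it
		ex = HtmlExtract()
		print(ex.get_text(response.text))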
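
For readers unfamiliar with what main-text extraction involves, here is a minimal sketch of one simple heuristic: strip the markup, then keep the longest run of consecutive long text chunks, on the assumption that navigation links and footers are short while article paragraphs are long. This is only an illustration and is not necessarily how `HtmlExtract` works internally. The names `extract_main_text` and `min_len` are hypothetical, and the sketch uses only the Python 3 standard library.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect visible text nodes, skipping the contents of script/style."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_main_text(html, min_len=20):
    """Return the longest run of consecutive text chunks of >= min_len chars.

    min_len is an illustrative threshold (an assumption, not a tuned value).
    """
    parser = _TextCollector()
    parser.feed(html)
    parser.close()

    def size(run):
        return sum(len(c) for c in run)

    best, current = [], []
    for chunk in parser.chunks:
        if len(chunk) >= min_len:
            current.append(chunk)
        else:
            # A short chunk (menu item, footer, etc.) ends the current run
            if size(current) > size(best):
                best = current
            current = []
    if size(current) > size(best):
        best = current
    return "\n".join(best)
```

A heuristic this simple will miss paragraphs that are broken up by short inline elements such as links or bold text, so a practical extractor needs more than this.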

References
