Latest News

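  • 2024-01-15: Six papers were accepted at the top-tier international conference ICLR 2024.
  • 2023-12-12: The papers "Bridging Continuous and Discrete Spaces: Interpretable Sentence Representation Learning via Compositional Operations" and "IfQA: A Dataset for Open-domain Question Answering under Counterfactual Presuppositions" received Outstanding Paper awards at the top-tier international conference EMNLP 2023.
  • 2023-10-06: 25 papers (15 long papers, 9 Findings long papers, 1 Findings short paper) were accepted at the top-tier international conference EMNLP 2023.
  • 2023-05-01: 21 papers (7 long papers, 1 short paper, 12 Findings long papers, 1 demonstration paper) were accepted at the top-tier international conference ACL 2023.
  • 2022-11-23: The paper "Adapters for Enhanced Modeling of Multilingual Knowledge and Text" was named Best Paper at the MRL workshop of the top-tier international conference EMNLP 2022.
  • 2022-10-06: 24 papers (14 long papers, 10 Findings long papers) were accepted at the top-tier international conference EMNLP 2022.
  • 2022-09-26: We ranked first in 3 tasks at the international machine translation competition WMT 2022.
  • 2022-09-15: Released word embeddings covering 6 million English words.
  • 2022-04-07: Released Effidit, an intelligent writing assistant.
  • 2022-02-24: 20 papers (13 long papers, 2 short papers, 3 Findings long papers, 2 Findings short papers) were accepted at the top-tier international conference ACL 2022.
  • 2021-09-16: At the international machine translation competition WMT 2021, our joint team with Tencent's Intelligent Platform Product Department ranked first in 5 tasks.
  • 2021-08-27: 13 papers (8 long papers, 4 Findings long papers, 1 Findings short paper) were accepted at the top-tier international conference EMNLP 2021.
  • 2021-07-05: The paper "Neural Machine Translation with Monolingual Translation Memory" was named one of six Outstanding Papers at the top-tier international conference ACL 2021.
  • 2021-05-27: The paper "Video-aided Unsupervised Grammar Induction" was named Best Long Paper at the top-tier international conference NAACL 2021.
  • 2021-05-08: 24 papers (15 long papers, 1 short paper, 6 Findings long papers, 2 Findings short papers) were accepted at the top-tier international conference ACL 2021.
  • 2020-11-01: The TexSmart text understanding system won the Best System Demonstration Award at the 19th China National Conference on Computational Linguistics (CCL 2020).
  • 2020-09-29: At the international machine translation competition WMT 2020, our joint team with Tencent's Intelligent Platform Product Department ranked first in 1 task and second in 3 tasks.
  • 2020-09-14: 15 papers (9 long papers, 1 short paper, 3 Findings long papers, 2 Findings short papers) were accepted at the top-tier international conference EMNLP 2020.
  • 2020-04-15: Released TexSmart, a text understanding system.
  • 2020-04-04: 20 papers (16 long papers, 4 short papers) were accepted at the top-tier international conference ACL 2020.
  • 2019-08-14: 17 papers (14 long papers, 3 short papers) were accepted at the top-tier international conference EMNLP 2019.
  • 2019-06-14: Staff member Longyue Wang's doctoral thesis on document-level neural machine translation (《篇章级神经网络机器翻译》) won the EAMT 2018 Best Thesis Award. The main body of the work was completed during an internship at Tencent AI Lab.
  • 2019-05-15: 18 papers (12 long papers, 6 short papers) were accepted at the top-tier international conference ACL 2019.
  • 2018-12-30: Achieved strong results on leaderboards including ARC-Easy, ARC-Challenge, and OpenbookQA.
  • 2018-11-13: Released TranSmart, an interactive translation system.
  • 2018-10-19: Released word embeddings covering 8 million Chinese words.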
    Research Directions

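      • Text understanding: multi-granularity text representation learning and knowledge-based semantic understanding.
      • Text generation: covers summarization, style transfer, knowledge-based question answering, multimodal sentence generation, and other directions.
      • Intelligent dialogue: aims to build more intelligent human-machine dialogue systems, especially for open-domain dialogue tasks.
      • Machine translation: advances translation quality from amateur to professional level along two paths: 1) basic research on automatic translation to further improve model performance; 2) research and system building for human-machine interactive translation, to improve translators' efficiency and translation quality.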

    Systems, Services, Toolkits, and Open Data

    Featured Systems

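      • Intelligent writing assistant Effidit: a research prototype that explores using AI to improve writers' efficiency and writing experience. Its main functions include text completion, error correction, text polishing, K2S (keyword-based sentence recommendation and generation), and a cloud input method. Text completion covers phrase completion, retrieval-based sentence completion, and AI sentence continuation; text polishing covers phrase polishing, text rewriting, and text expansion. [Online demo | Feature guide]
      • Text understanding system TexSmart: a natural language understanding toolkit and service for lexical, syntactic, and semantic analysis of Chinese and English text. Besides common functions such as word segmentation, part-of-speech tagging, named entity recognition (NER), syntactic parsing, and semantic role labeling, TexSmart also provides distinctive functions such as fine-grained named entity recognition, semantic expansion, and deep semantic representation. [Online demo | Online HTTP API | Download toolkit]
      • Interactive translation system TranSmart: a translation service based on human-machine interaction, designed to improve the efficiency of human translators and reduce translation costs for customers. [Demo link | API (to be opened later)]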

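    Featured Open Data

      • Large-scale word embeddings (Chinese, English): word vectors trained on large-scale corpora, covering 8 million Chinese words and 6.5 million English words, available in 100 and 200 dimensions.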

    Selected Papers

    Full list
      • Songyang Zhang, Linfeng Song, Lifeng Jin, Kun Xu, Dong Yu, and Jiebo Luo. Video-aided Unsupervised Grammar Induction. NAACL 2021. [Best Paper]
      • Deng Cai, Yan Wang, Huayang Li, Wai Lam, and Lemao Liu. Neural Machine Translation with Monolingual Translation Memory. ACL 2021. [Outstanding Paper]
      • Yifan Hou, Wenxiang Jiao, Meizhen Liu, Carl Allen, Zhaopeng Tu, and Mrinmaya Sachan. Adapters for Enhanced Modeling of Multilingual Knowledge and Text. EMNLP 2022 (Findings). [Best Paper of MRL Workshop]
      • Fan Yang, Wenchuan Wang, Fang Wang, Yuan Fang, Duyu Tang, Junzhou Huang, Hui Lu, and Jianhua Yao. scBERT as a Large-Scale Pretrained Deep Language Model for Cell Type Annotation of Single-Cell RNA-Seq Data. Nature Machine Intelligence (2022).
      • James Y. Huang, Wenlin Yao, Kaiqiang Song, Hongming Zhang, Muhao Chen, and Dong Yu. Bridging Continuous and Discrete Spaces: Interpretable Sentence Representation Learning via Compositional Operations. EMNLP 2023. [Outstanding Paper]
      • Wenhao Yu, Meng Jiang, Peter Clark, and Ashish Sabharwal. IfQA: A Dataset for Open-domain Question Answering under Counterfactual Presuppositions. EMNLP 2023. [Outstanding Paper]
      • Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Xing Wang, and Zhaopeng Tu. Is ChatGPT A Good Translator? A Preliminary Study. Preprint 2023. [Citations: 333]
      • Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, and Shuming Shi. Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models. Preprint 2023. [Citations: 83]
      • Tian Liang, Zhiwei He, Wenxiang Jiao, Xing Wang, Yan Wang, Rui Wang, Yujiu Yang, Zhaopeng Tu, and Shuming Shi. Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate. Preprint 2023. [Citations: 44]
      • Longyue Wang, Chenyang Lyu, Tianbo Ji, Zhirui Zhang, Dian Yu, Shuming Shi, and Zhaopeng Tu. Document-Level Machine Translation with Large Language Models. EMNLP 2023. [Citations: 33]
      • Cunxiao Du, Zhaopeng Tu, and Jing Jiang. Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation. ICML 2021. [Citations: 62]
      • Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, Dacheng Tao, and Zhaopeng Tu. Understanding and Improving Lexical Choice in Non-Autoregressive Translation. ICLR 2021. [Citations: 62]
      • Yangming Li, Lemao Liu, and Shuming Shi. Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition. ICLR 2021. [Citations: 64]
      • Dian Yu, Kai Sun, Claire Cardie, and Dong Yu. Dialogue-Based Relation Extraction. ACL 2020. [Citations: 115]
      • Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, and William Yang Wang. Logical Natural Language Generation from Open-Domain Tables. ACL 2020. [Citations: 125]
      • Haoyu Song, Yan Wang, Wei-Nan Zhang, Xiaojiang Liu, and Ting Liu. Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation. ACL 2020. [Citations: 77]
      • Zhenyi Wang, Xiaoyang Wang, Bang An, Dong Yu, and Changyou Chen. Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints. ACL 2020. [Citations: 73]
      • Shuo Wang, Zhaopeng Tu, Shuming Shi, and Yang Liu. On the Inference Calibration of Neural Machine Translation. ACL 2020. [Citations: 63]
      • Qile Zhu, Wei Bi, Xiaojiang Liu, Xiyao Ma, Xiaolin Li, and Dapeng Wu. A Batch Normalized Inference Network Keeps the KL Vanishing Away. ACL 2020. [Citations: 64]
      • Jian Liu, Yubo Chen, Kang Liu, Wei Bi, and Xiaojiang Liu. Event Extraction as Machine Reading Comprehension. EMNLP 2020. [Citations: 250]
      • Yiwu Zhong, Liwei Wang, Jianshu Chen, Dong Yu, and Yin Li. Comprehensive Image Captioning via Scene Graph Decomposition. ECCV 2020. [Citations: 109]
      • Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, and William Yang Wang. Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention. ACL 2019. [Citations: 137]
      • Xintong Li, Guanlin Li, Lemao Liu, Max Meng, and Shuming Shi. On the Word Alignment from Neural Machine Translation. ACL 2019. [Citations: 73]
      • Yue Wang, Jing Li, Hou Pong Chan, Irwin King, Michael R. Lyu, and Shuming Shi. Topic-Aware Neural Keyphrase Generation for Social Media Language. ACL 2019. [Citations: 88]
      • Kun Xu, Liwei Wang, Mo Yu, Yansong Feng, Yan Song, Zhiguo Wang, and Dong Yu. Cross-lingual Knowledge Graph Alignment via Graph Matching Neural Network. ACL 2019 (Short). [Citations: 235]
      • Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Yejin Choi, and Claire Cardie. DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension. TACL 2019. [Citations: 252]
      • Xing Wang, Zhaopeng Tu, Longyue Wang, and Shuming Shi. Self-Attention Networks with Structural Position Representations. EMNLP 2019 (Short). [Citations: 71]
      • Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, and Shuming Shi. Retrieval-Guided Dialogue Response Generation via a Matching-to-Generation Framework. EMNLP 2019. [Citations: 75]
      • Kai Sun, Dian Yu, Dong Yu, and Claire Cardie. Improving Machine Reading Comprehension with General Reading Comprehension. NAACL 2019. [Citations: 124]
      • Jie Hao, Xing Wang, Baosong Yang, Longyue Wang, Jinfeng Zhang, and Zhaopeng Tu. Modeling Recurrence for Transformer. NAACL 2019. [Citations: 82]
      • Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, Wai Lam, and Shuming Shi. Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory. NAACL 2019. [Citations: 75]
      • Baosong Yang, Longyue Wang, Derek Wong, Lidia S. Chao, and Zhaopeng Tu. Convolutional Self-Attention Networks. NAACL 2019 (Short). [Citations: 127]
      • Peifeng Wang, Jialong Han, Chenliang Li, and Rong Pan. Logic Attention Based Neighborhood Aggregation for Inductive Knowledge Graph Embedding. AAAI 2019. [Citations: 118]
      • Jiaao Chen, Jianshu Chen, and Zhou Yu. Incorporating Commonsense Knowledge for Story Completion. AAAI 2019. [Citations: 68]
      • Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, and Shuming Shi. Generating Multiple Diverse Responses for Short-Text Conversation. AAAI 2019. [Citations: 56]
      • Xin Li, Lidong Bing, Wai Lam, and Bei Shi. Transformation Networks for Target-Oriented Sentiment Classification. ACL 2018. [Citations: 495]
      • Yong Cheng, Zhaopeng Tu, Fandong Meng, Junjie Zhai, and Yang Liu. Towards Robust Neural Machine Translation. ACL 2018. [Citations: 166]
      • Zhaopeng Tu, Yang Liu, Shuming Shi, and Tong Zhang. Learning to Remember Translation History with a Continuous Cache. TACL 2018. [Citations: 187]
      • Baosong Yang, Zhaopeng Tu, Derek F. Wong, Fandong Meng, Lidia S. Chao, and Tong Zhang. Modeling Localness for Self-Attention Networks. EMNLP 2018. [Citations: 187]
      • Jichuan Zeng, Jing Li, Yan Song, Cuiyun Gao, Michael R. Lyu, and Irwin King. Topic Memory Networks for Short Text Classification. EMNLP 2018. [Citations: 156]
      • Ziyi Dou, Zhaopeng Tu, Xing Wang, Shuming Shi, and Tong Zhang. Exploiting Deep Representations for Neural Machine Translation. EMNLP 2018. [Citations: 88]
      • Dingmin Wang, Yan Song, Jing Li, Jialong Han, and Haisong Zhang. A Hybrid Approach to Automatic Corpus Generation for Chinese Spelling Check. EMNLP 2018. [Citations: 118]
      • Juntao Li, Yan Song, Haisong Zhang, Dongmin Chen, Shuming Shi, and Rui Yan. Generating Classical Chinese Poems via Conditional Variational Autoencoder and Adversarial Training. EMNLP 2018. [Citations: 66]
      • Jian Li, Zhaopeng Tu, Baosong Yang, Michael R. Lyu, and Tong Zhang. Multi-Head Attention with Disagreement Regularization. EMNLP 2018 (Short). [Citations: 162]
      • Lei Wang, Yan Wang, Deng Cai, Dongxiang Zhang, and Xiaojiang Liu. Translating Math Word Problem to Expression Tree. EMNLP 2018 (Short). [Citations: 156]
      • Yan Song, Shuming Shi, Jing Li, and Haisong Zhang. Directional Skip-Gram: Explicitly Distinguishing Left and Right Context for Word Embeddings. NAACL 2018 (Short). [Citations: 364]
      • Xin Li, Lidong Bing, Piji Li, Wai Lam, and Zhimou Yang. Aspect Term Extraction with History Attention and Selective Transformation. IJCAI 2018. [Citations: 243]
      • Yan Wang, Xiaojiang Liu, and Shuming Shi. Deep Neural Solver for Math Word Problems. EMNLP 2017. [Citations: 303]
      • Danqing Huang, Shuming Shi, Jian Yin, and Chin-Yew Lin. Learning Fine-Grained Expressions to Solve Math Word Problems. EMNLP 2017. [Citations: 93]
      • Xing Wang, Zhaopeng Tu, Deyi Xiong, and Min Zhang. Translating Phrases in Neural Machine Translation. EMNLP 2017. [Citations: 68]
      • Longyue Wang, Zhaopeng Tu, Andy Way, and Qun Liu. Exploiting Cross-Sentence Context for Neural Machine Translation. EMNLP 2017 (Short). [Citations: 220]