You Call It Far: Hutter says BLT good idea done 20 years ago, cites baby steps

To: marcus.hutter@anu.edu.au Aug 27 at 11:00 AM

I would like to propose a research task of Blind Language Translation, and get your opinion.  The task would consist of automatically translating a large (100 terabyte or more) corpus of Spanish texts to English without utilizing any correspondence information (like aligned texts).  My understanding is that the consensus among linguists is that this should be impossible; from a Big Data perspective, however, I disagree.  I'm sure you're aware of many NLP papers on unsupervised learning.

I think the BLT task is a tangible path towards developing the features and frame representations needed for AI.  I also think BLT offers a framework for understanding intelligence, both human and otherwise, as in Searle's Chinese Room.

It is tantalizing to consider that a corpus with sufficient semantic breadth, depth, and redundancy may "objectively" encode information.  However, even if a BLT task succeeded, it might be considered to have implicitly cheated.  I hope this raises questions of "objectivity", humanity, and even (Alien) Civilization Models.



From: Marcus Hutter

> I would like to propose a research task of Blind Language Translation, and get your opinion.  The task would consist of automatically translating a large (100 terabyte or more) corpus of Spanish texts to English without utilizing any correspondence information (like aligned texts).

Good idea, but it was already done 20 years ago
http://arxiv.org/abs/cmp-lg/9505037
http://dx.doi.org/10.3115/981658.981709
with lots of follow-up work
http://scholar.google.com.au/scholar?cites=5467964337065257077
I think Google Translate uses this too.

And you're right to disagree :-)

Sunday, August 30, 2015
