NAME Text::MeCab - Alternate Interface To libmecab SYNOPSIS use Text::MeCab; my $mecab = Text::MeCab->new({ rcfile => $rcfile, dicdir => $dicdir, userdic => $userdic, lattice_level => $lattice_level, all_morphs => $all_morphs, output_format_type => $output_format_type, partial => $partial, node_format => $node_format, unk_format => $unk_format, bos_format => $bos_format, eos_format => $eos_format, input_buffer_size => $input_buffer_size, allocate_sentence => $allocate_sentence, nbest => $nbest, theta => $theta, }); for (my $node = $mecab->parse($text); $node; $node = $node->next) { # See perdoc for Text::MeCab::Node for list of methods print $node->surface, "\n"; } # use constants use Text::MeCab qw(:all); use Text::MeCab qw(MECAB_NOR_NODE); # want to use a command line arguments? my $mecab = Text::MeCab->new("--userdic=/foo/bar/baz", "-P"); # check what mecab version we compiled against? print "Compiled with ", &Text::MeCab::MECAB_VERSION, "\n"; DESCRIPTION libmecab (http://mecab.sourceforge.ne.jp) already has a perl interface built with it, so why a new module? I just feel that while a subtle difference, making the perl interface through a tied hash is just... weird. So Text::MeCab gives you a more natural, Perl-ish way to access libmecab! WARNING: Please note that this module is primarily targetted for libmecab >= 0.90, so if things seem to be broken and your libmecab version is below 0.90, then you might want to consider upgrading libmecab first. METHODS new HASHREF | LIST Creates a new Text::MeCab instance. You can either specify a hashref and use named parameters, or you can use the exact command line arguments that the mecab command accepts. Below is the list of accepted named options. See the man page for mecab for details about each option. rcfile dicdir lattice_level all_morphs output_format_type partial node_format unk_format bos_format eos_format input_buffer_size allocate_sentence nbest theta parse SCALAR Parses the given text via mecab, and returns a Text::MeCab::Node object. NOTES ABOUT PARSED STRUCTURE Please note that Text::MeCab::parse() creates Text::MeCab::Node objects that are "detatched" from libmecab Tagger. This is to allow these Perl-ish idioms: my $node; { my $mecab = Text::MeCab->new; $node = $mecab->parse($text); # $mecab goes out of scope } for(; $node; $node = $node->next) { print $node->surface, "\n"; } and, my $mecab = Text::MeCab->new; my $node_A = $mecab->parse($text_A); my $node_B = $mecab->parse($text_B); If we are to use the mecab nodes directly we would have to carefully control the scope between the mecab tagger object and the nodes. Since this is perl, I chose maniplexity over efficiency. Let me know if there are problems with this approach. SEE ALSO http://mecab.sourceforge.ne.jp AUTHOR (c) 2006 Daisuke Maki All rights reserved.