# NAME

Twitter::Text - Perl implementation of the twitter-text parsing library

# SYNOPSIS

    use Twitter::Text;

    $result = parse_tweet('Hello world こんにちは世界');
    print $result->{valid} ? 'valid tweet' : 'invalid tweet';

# DESCRIPTION

Twitter::Text is a Perl implementation of the twitter-text parsing library.

## WARNING

This library does not implement auto-linking and hit highlighting.

Please refer [Implementation progress](https://github.com/utgwkk/Twitter-Text/issues/5) for latest status.

# FUNCTIONS

## Extraction

### extract\_hashtags

    my \@hashtags = extract_hashtags($text);

### extract\_hashtags\_with\_indices

    my \@hashtags_with_indices = extract_hashtags_with_indices($text, [\%options]);

### extract\_mentioned\_screen\_names

    my \@screen_names = extract_mentioned_screen_names($text);

### extract\_mentioned\_screen\_names\_with\_indices

    my \@screen_names_with_indices = extract_mentioned_screen_names_with_indices($text);

### extract\_mentions\_or\_lists\_with\_indices

    my \@mentions_or_lists_with_indices = extract_mentions_or_lists_with_indices($text);

### extract\_urls

    my \@urls = extract_urls($text);

### extract\_urls\_with\_indices

    my \@urls = extract_urls_with_indices($text, [\%options]);

## Validation

### parse\_tweet

    my \%parse_result = parse_tweet($text, [\%options]);

The `parse_tweet` function takes a `$text` string and optional `\%options` parameter and returns a hash reference with following values:

- `weighted_length`: the overall length of the tweet with code points weighted per the ranges defined in the configuration file.
- `permillage`: indicates the proportion (per thousand) of the weighted length in comparison to the max weighted length. A value > 1000 indicates input text that is longer than the allowable maximum.
- `valid`: indicates if input text length corresponds to a valid result.
- `display_range_start`, `display_range_end`: An array reference of two unicode code point indices identifying the inclusive start and exclusive end of the displayable content of the Tweet.
- `vaildRangeStart`, `valid_range_end`: An array reference of two unicode code point indices identifying the inclusive start and exclusive end of the valid content of the Tweet.

### is\_valid\_hashtag

    my $valid = is_valid_hashtag($hashtag);

### is\_valid\_list

    my $valid = is_valid_list($username_list);

### is\_valid\_url

    my $valid = is_valid_url($url, [unicode_domains => 1, require_protocol => 1]);

### is\_valid\_username

    my $valid = is_valid_username($username);

# SEE ALSO

[twitter-text](https://github.com/twitter/twitter-text). Implementation of Twitter::Text (this library) is heavily based on [Ruby implementation of twitter-text](https://github.com/twitter/twitter-text/tree/master/rb).

[https://developer.twitter.com/en/docs/counting-characters](https://developer.twitter.com/en/docs/counting-characters)

# COPYRIGHT & LICENSE

Copyright (C) Twitter, Inc and other contributors

Copyright (C) utgwkk.

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.

# AUTHOR

utgwkk <utagawakiki@gmail.com>