RegularExpression
Module providing routines to check and convert between regular and file expressions.
Utility module used by ClearMap.Utils.TagExpression.
- class PatternInverter(groups=None)
Bases: object
- at_categories = {AT_BEGINNING: '^', AT_BEGINNING_STRING: '\\A', AT_BOUNDARY: '\\b', AT_NON_BOUNDARY: '\\B', AT_END: '$', AT_END_STRING: '\\Z'}
- escapes = {7: '\\a', 8: '\\b', 9: '\\t', 10: '\\n', 11: '\\v', 12: '\\f', 13: '\\r', 92: '\\'}
- in_categories = {CATEGORY_DIGIT: '\\d', CATEGORY_NOT_DIGIT: '\\D', CATEGORY_SPACE: '\\s', CATEGORY_NOT_SPACE: '\\S', CATEGORY_WORD: '\\w', CATEGORY_NOT_WORD: '\\W'}
- expression_to_glob(expression, replace=None, default='*', ignore='.[]')
Converts a regular expression to a glob expression, e.g. to search for files.
Arguments
- expression : str
The regular expression.
- replace : dict, all or None
A dictionary specifying how to replace specific groups. If all or None, all groups are replaced with the default.
- ignore : list of chars
Ignore these special chars in the regular expression.
Returns
- expression : str
The regular expression in glob form.
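The idea of turning a regular expression into a glob can be illustrated with the standard library alone. The following is a minimal sketch, not ClearMap's implementation: the helper name regex_groups_to_glob is hypothetical, it only handles groups without nesting, and it treats only the escaped dot as a literal.

```python
import fnmatch
import re


def regex_groups_to_glob(expression, default="*"):
    """Replace each non-nested group with `default` (hypothetical helper)."""
    # Match '(...)' or '(?P<name>...)' as long as the group contains no parentheses.
    group = r"\((?:\?P<\w+>)?[^()]*\)"
    globbed = re.sub(group, lambda match: default, expression)
    # An escaped dot in the regular expression is a plain dot in a file name.
    return globbed.replace(r"\.", ".")


glob_pattern = regex_groups_to_glob(r"image_(?P<z>\d+)_(?P<channel>\w+)\.tif")
print(glob_pattern)  # image_*_*.tif

# The resulting glob can be used with fnmatch or glob.glob to find files.
print(fnmatch.fnmatchcase("image_0012_red.tif", glob_pattern))  # True
```

The resulting pattern is broader than the original expression, so a file search with the glob is usually followed by a match against the regular expression itself.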
- expression_to_pattern(expression, ignore=None)
Convert a regular expression to a parsed pattern for manipulation.
Arguments
- expression : str
The regular expression to convert.
Returns
- pattern : list
The parsed pattern of the regular expression.
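The constant names listed for PatternInverter (AT_BEGINNING, CATEGORY_DIGIT, ...) match those of CPython's internal regular expression parser, so the parsed pattern probably resembles that parser's output. This is an assumption, not something the page states. A minimal sketch of what such a parsed pattern looks like, using the private parser module directly:

```python
try:
    import re._parser as sre_parse  # Python 3.11 and later
except ImportError:
    import sre_parse  # earlier Python versions

# Parse an expression into a sequence of (opcode, argument) pairs.
parsed = sre_parse.parse(r"ab(\d+)")

for opcode, argument in parsed:
    # Two LITERAL entries for 'a' and 'b', then one SUBPATTERN entry for the group.
    print(opcode, argument)
```

The parser module is private and its output format may change between Python versions, so code built on it should be tested against the interpreter in use.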
- format_expression(expression, ignore=None)
Inserts escapes in front of certain regular expression symbols.
Arguments
- expression : str
The regular expression.
- ignore : list of chars
A list of characters to ignore as regular expression commands.
Returns
- expression : str
The regular expression with the ignored characters escaped.
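Escaping ignored characters can be sketched as follows. This is an illustration under stated assumptions, not ClearMap's code: the helper name escape_ignored is hypothetical, and existing escape sequences are assumed to be left untouched.

```python
import re


def escape_ignored(expression, ignore=None):
    """Prefix every character in `ignore` with a backslash (hypothetical helper)."""
    if not ignore:
        return expression
    pieces = []
    i = 0
    while i < len(expression):
        char = expression[i]
        if char == "\\" and i + 1 < len(expression):
            # Keep existing escape sequences such as '\d' or '\.' as they are.
            pieces.append(expression[i:i + 2])
            i += 2
            continue
        pieces.append("\\" + char if char in ignore else char)
        i += 1
    return "".join(pieces)


escaped = escape_ignored(r"file_\d+.tif", ignore=["."])
print(escaped)  # file_\d+\.tif

# With the dot escaped, only a literal '.' is accepted before 'tif'.
print(re.fullmatch(escaped, "file_12.tif") is not None)  # True
print(re.fullmatch(escaped, "file_12xtif") is not None)  # False
```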
- glob_to_expression(expression, groups=None, to_group='*')
Converts a glob expression to a regular expression.
Arguments
- expression : str
A glob expression.
- groups : dict or None
A dictionary specifying how to name groups in the form {id : name}.
- to_group : list of chars or None
Glob placeholders to convert to a group.
Returns
- expression : str
The regular expression.
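The reverse direction, from a glob to a regular expression with groups, can be sketched like this. The helper name glob_to_regex is hypothetical, only the `*` placeholder is handled, and the zero-based group ids in the {id : name} dictionary are an assumption rather than documented behavior.

```python
import re


def glob_to_regex(glob_expression, groups=None):
    """Turn each '*' into a (possibly named) group (hypothetical helper)."""
    groups = groups or {}
    parts = []
    group_id = 0  # assumption: ids count the placeholders from zero
    for char in glob_expression:
        if char == "*":
            name = groups.get(group_id)
            # Use a lazy wildcard so neighbouring literals delimit the group.
            parts.append(f"(?P<{name}>.*?)" if name else "(.*?)")
            group_id += 1
        else:
            # Every other character is matched literally.
            parts.append(re.escape(char))
    return "".join(parts)


regex = glob_to_regex("image_*_*.tif", groups={0: "z", 1: "channel"})
match = re.fullmatch(regex, "image_0012_red.tif")
print(match.groupdict())  # {'z': '0012', 'channel': 'red'}
```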
- group_dict(expression, value, as_types=[int, float])
Returns a dictionary with the values of the groups in the regular expression that match the value string.
Arguments
- expression : string
The regular expression.
- value : string
The text to match and extract group values from.
- as_types : list of types
List of types to try to convert the extracted group value to.
Returns
- values : dict
The values for each group item.
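Extracting typed group values can be approximated with the standard re module. The sketch below is an assumption about the behavior described above, not ClearMap's code: the helper name match_group_values is hypothetical, types are tried in the given order, and an empty dictionary on a failed match is a choice made for this example.

```python
import re


def match_group_values(expression, value, as_types=(int, float)):
    """Return named group values, converted to the first type that fits."""
    match = re.match(expression, value)
    if match is None:
        return {}  # assumption: no match yields an empty dictionary
    values = {}
    for name, text in match.groupdict().items():
        converted = text
        for as_type in as_types:
            try:
                converted = as_type(text)
                break  # stop at the first successful conversion
            except (TypeError, ValueError):
                continue  # try the next type, or keep the string
        values[name] = converted
    return values


values = match_group_values(
    r"image_(?P<z>\d+)_(?P<exposure>[\d.]+)_(?P<channel>\w+)\.tif",
    "image_0012_0.5_red.tif",
)
print(values)  # {'z': 12, 'exposure': 0.5, 'channel': 'red'}
```

Trying int before float keeps whole numbers such as slice indices as integers, while values that fit neither type stay strings.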
- group_names(expression)
Returns the names of groups in the regular expression.
Arguments
- expression : str
The regular expression.
Returns
- names : list of str
The group names in the regular expression sorted according to appearance.
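With the standard library, group names in order of appearance can be read from a compiled pattern's groupindex mapping. This is a minimal sketch with a hypothetical helper name, not ClearMap's implementation:

```python
import re


def named_groups_in_order(expression):
    """List named groups sorted by their position in the expression."""
    compiled = re.compile(expression)
    # groupindex maps each name to its group number; sort by that number.
    by_position = sorted(compiled.groupindex.items(), key=lambda item: item[1])
    return [name for name, _ in by_position]


names = named_groups_in_order(r"(?P<row>\d+)_(?P<col>\d+)_(?P<channel>\w+)")
print(names)  # ['row', 'col', 'channel']
```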
- insert_group_names(expression, groups=None, ignore=None)
Inserts group names into a regular expression for specified groups.
Arguments
- expression : str
The regular expression.
- groups : dict or None
A dictionary specifying the group names as {groupid : groupname}.
Returns
- expression : str
The regular expression with named groups.
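Naming previously unnamed groups can be sketched with a substitution on the opening parenthesis. The helper name name_unnamed_groups is hypothetical, ids are assumed to count unnamed groups from zero, and parentheses inside character classes or after an escaped backslash are not handled.

```python
import re


def name_unnamed_groups(expression, groups):
    """Turn '(' of unnamed groups into '(?P<name>' (hypothetical helper)."""
    counter = 0

    def rename(match):
        nonlocal counter
        name = groups.get(counter)
        counter += 1
        # Leave the group unnamed when no name is given for its id.
        return f"(?P<{name}>" if name else match.group(0)

    # An unescaped '(' that is not followed by '?' opens an unnamed capturing group.
    return re.sub(r"(?<!\\)\((?!\?)", rename, expression)


named = name_unnamed_groups(r"image_(\d+)_(\w+)\.tif", {0: "z", 1: "channel"})
print(named)  # image_(?P<z>\d+)_(?P<channel>\w+)\.tif
```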
- is_expression(expression, group_names=None, n_patterns=None, ignore=None, exclude=None, verbose=False)
Checks if the regular expression fulfills certain criteria.
Arguments
- expression : str
The regular expression to check.
- group_names : list of str or None
List of group names expected to be present in the expression.
- n_patterns : int or None
Number of group patterns to expect. If negative, the expression is expected to have at least this number of groups.
- ignore : list of chars or None
Optional list of chars that should not be regarded as a regular expression command. Useful for filenames, e.g. by setting ignore = ['.'].
- exclude : list of str or None
Exclude these tokens when counting groups.
- verbose : bool
If True, print the reason why the expression does not fulfill the desired criteria.
Returns
- is_expression : bool
Returns True if the expression fulfills the desired criteria.
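A check of this kind can be sketched with the standard library. The example below is not ClearMap's implementation: the helper name meets_criteria is hypothetical, it counts capturing groups only, it ignores the ignore and exclude options, and it reads a negative n_patterns as "at least the absolute value", which is an interpretation of the description above.

```python
import re


def meets_criteria(expression, group_names=None, n_patterns=None):
    """Check group names and group count of an expression (hypothetical helper)."""
    try:
        compiled = re.compile(expression)
    except re.error:
        return False  # not a valid regular expression at all
    if group_names is not None:
        # Every expected name must appear as a named group.
        if not set(group_names) <= set(compiled.groupindex):
            return False
    if n_patterns is not None:
        if n_patterns >= 0 and compiled.groups != n_patterns:
            return False
        # Assumption: a negative value means "at least abs(n_patterns) groups".
        if n_patterns < 0 and compiled.groups < -n_patterns:
            return False
    return True


expression = r"(?P<z>\d+)_(\w+)"
print(meets_criteria(expression, group_names=["z"], n_patterns=2))  # True
print(meets_criteria(expression, n_patterns=-3))  # False
```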
- n_groups(expression)
Returns the number of groups in the expression.
Arguments
- expression : str
The regular expression.
Returns
- n : int
The number of groups in the expression.
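For comparison, the standard library exposes the number of capturing groups on a compiled pattern. Whether n_groups counts non-capturing groups the same way is not stated here, so the following hypothetical helper only shows what Python's re module counts:

```python
import re


def count_capturing_groups(expression):
    """Return the number of capturing groups reported by the re module."""
    return re.compile(expression).groups


# Two capturing groups; the non-capturing '(?:...)' is not counted.
count = count_capturing_groups(r"(?P<z>\d+)_(\w+)_(?:raw|filtered)")
print(count)  # 2
```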
- pattern_to_expression(pattern)
Convert a pattern to a regular expression.
Arguments
- pattern : list
The regular expression in pattern form.
Returns
- expression : str
The regular expression.
- replace(expression, replace=None, ignore=None)
Replaces patterns in a regular expression with given strings.
Arguments
- expression : str
The regular expression.
- replace : dict
The replacements to do in the regular expression given as {pos : str} or {groupname : str}.
- ignore : list of chars
Ignore certain regular expression commands.
Returns
- replaced : str
The regular expression with replacements.
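Replacing groups by fixed strings can be sketched for the {groupname : str} form. This is an illustration, not ClearMap's code: the helper name replace_named_groups is hypothetical, nested groups are not handled, and escaping the inserted text is a choice made for this example.

```python
import re


def replace_named_groups(expression, replace):
    """Substitute selected named groups with literal text (hypothetical helper)."""

    def substitute(match):
        name = match.group("name")
        if name in replace:
            # Escape the text so it is matched literally in the new expression.
            return re.escape(replace[name])
        return match.group(0)  # keep groups without a replacement

    # Match '(?P<name>...)' as long as the group contains no parentheses.
    return re.sub(r"\(\?P<(?P<name>\w+)>[^()]*\)", substitute, expression)


replaced = replace_named_groups(
    r"image_(?P<z>\d+)_(?P<channel>\w+)\.tif", {"channel": "red"}
)
print(replaced)  # image_(?P<z>\d+)_red\.tif
```

Fixing one group while keeping the others gives an expression that still extracts the remaining values, for example the z index of all files from a single channel.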
- subpatterns_to_groups(expression, ignore=None, exclude=None, group_names=None)
Replaces subpatterns with groups in a regular expression.
Arguments
- expression : str
The regular expression to check.
- ignore : list of chars or None
Optional list of chars that should not be regarded as a regular expression command. Useful for filenames, e.g. by setting ignore = ['.'].
- exclude : list of str or None
Exclude these tokens when counting groups.
- group_names : list of str
The group names to use for the new groups.
Returns
- expression : str
The regular expression with subpatterns replaced as groups.