Godot RegEx (Regular Expressions)
When we want to search for or replace a particular piece of text in Godot, the String functions make this easy. But when the search string is not exact, such as when there are regional spelling variations, or when we want to find any word that fits a pattern, we can use RegEx (Regular Expressions).
The very thought of Regular Expressions strikes fear into many people because they have a reputation for being hard to figure out, so I will try to explain them in easy-to-understand terms.
A RegEx is a sequence of characters that specifies a search pattern. The text is scanned from beginning to end and the pattern is tried against it at each position, much as if you were looking for a word on the page of a book. A pattern may be as simple as abc, which matches any text containing those three characters in a row. It can match anywhere in the text (beginning, middle, or end), and it may match several times.
In Godot we use the RegEx class: we create a new RegEx object, then compile our search pattern.
var regex = RegEx.new()
regex.compile("abc") # Compile our pattern
Now we may use the search method to get the first match, or search_all to get all the matches.
var txt = "abc xyz abcdefg"
var regex = RegEx.new()
regex.compile("abc")
var result = regex.search(txt)
if result:
    print(result.get_string()) # prints abc
result = regex.search_all(txt)
if result:
    print(result) # prints an array of the search matches
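To read the individual matches rather than print the whole array, we can loop over what search_all returns. This is only a minimal sketch: attaching the script to a Node and running it from _ready() is my assumption for the sake of a runnable example.

```gdscript
extends Node

func _ready() -> void:
    var regex = RegEx.new()
    regex.compile("abc")
    # search_all returns an Array of RegExMatch objects (empty if nothing matched).
    for m in regex.search_all("abc xyz abcdefg"):
        # get_start() is the index of the first matched character.
        print(m.get_string(), " at index ", m.get_start())
    # prints:
    # abc at index 0
    # abc at index 8
```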
We usually want to specify ranges or sets of characters to match, such as lowercase letters. We may do this like so: [a-z]. This matches a single lowercase letter. We may do the same for digits with [0-9] and for upper case letters with [A-Z]. There are short-cut codes for some of these patterns (for digits the code is \d), but as a beginner it’s easier to remember the bracketed forms.
- [a-z] matches ‘a’ and ‘b’ in “aXFGTb1234”
We may also list individual characters in a set, such as [xyz0-9], which matches a single character that is x, y, z, or a digit.
Another useful pattern matches characters that are not in a set, such as [^x2@]. This will match any character that is not x, 2, or @.
- [^dxg] matches ‘z’, ‘2’, and ‘X’ in “dzxg2Xdg”
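The two bullet examples above can be tried out with a short script. The find_all helper is a hypothetical convenience function of my own, not part of Godot; it just collects the matched strings into an array.

```gdscript
extends Node

# Hypothetical helper (not a Godot built-in): returns every matched string.
func find_all(pattern: String, text: String) -> Array:
    var regex = RegEx.new()
    regex.compile(pattern)
    var found = []
    for m in regex.search_all(text):
        found.append(m.get_string())
    return found

func _ready() -> void:
    # A character range: finds a and b.
    print(find_all("[a-z]", "aXFGTb1234"))
    # A negated set: finds z, 2, and X.
    print(find_all("[^dxg]", "dzxg2Xdg"))
```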
We also want to specify how many times in a row the pattern should be matched, such as once, exactly twice, or zero or more times. We add special symbols after our pattern to specify these.
- ? The question mark indicates zero or one of the preceding element.
- * The asterisk indicates zero or more of the preceding element.
- + The plus indicates one or more of the preceding element.
- {n} Match the preceding element exactly n times.
- {n,} Match the preceding element at least n times.
- {n,m} Match the preceding element between n and m times.
- . The wildcard matches any character (except a line break, by default).
- dogs? matches “dog” in “dog snacks” and matches “dogs” in “the dogs barked”
- 10* matches “1” in “312” and matches “100” in “100,000”
- x+ matches “x” in “dx10” and matches “xxxx” in “0xxxx + 3”
- 6{2} matches “66” in “666”
- 6{2,3} matches nothing in “6” but matches “666” in “66666”
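Here is a small sketch that runs a few of these quantifier examples. It also shows that search returns null when nothing matches, which is worth checking before calling get_string(). Running it from a Node's _ready() is my assumption.

```gdscript
extends Node

func _ready() -> void:
    var regex = RegEx.new()

    regex.compile("dogs?")
    print(regex.search("the dogs barked").get_string())  # dogs

    regex.compile("10*")
    print(regex.search("100,000").get_string())  # 100

    regex.compile("x+")
    print(regex.search("0xxxx + 3").get_string())  # xxxx

    regex.compile("6{2,3}")
    # A single 6 is too short for {2,3}, so search() returns null here.
    var m = regex.search("6")
    if m == null:
        print("no match")  # no match
```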
To match tabs and newlines we use escape sequences such as \t and \n. The backslash is also used to escape any of the special characters, such as the dot and the asterisk, when we actually want to match those characters literally, e.g. \.
- [a-z]+\.com matches “website.com” in “http://website.com”
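One thing to watch for in code: inside an ordinary GDScript string literal the backslash is itself an escape character, so it has to be doubled for the RegEx to receive a single backslash. A minimal sketch, assuming the script is attached to a Node:

```gdscript
extends Node

func _ready() -> void:
    var regex = RegEx.new()

    # "\\." in GDScript source becomes \. in the pattern: a literal dot.
    regex.compile("[a-z]+\\.com")
    var m = regex.search("http://website.com")
    if m:
        print(m.get_string())  # website.com

    # "\\d+" becomes \d+ in the pattern: one or more digits.
    regex.compile("\\d+")
    m = regex.search("aXFGTb1234")
    if m:
        print(m.get_string())  # 1234
```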
By default, matches are greedy: they capture the longest character sequence that matches. To capture the shortest instead, add the non-greedy specifier, a question mark, after the quantifier.
- a+? matches “a” in “aaaaa”
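The difference is easier to see side by side. The tag-like text in this sketch is my own illustrative input, not something from the Godot docs.

```gdscript
extends Node

func _ready() -> void:
    var regex = RegEx.new()

    regex.compile("a+")   # greedy: takes as many a's as it can
    print(regex.search("aaaaa").get_string())  # aaaaa

    regex.compile("a+?")  # non-greedy: stops as soon as it can
    print(regex.search("aaaaa").get_string())  # a

    var text = "<b>bold</b>"
    regex.compile("<.+>")   # greedy: runs on to the last >
    print(regex.search(text).get_string())  # <b>bold</b>

    regex.compile("<.+?>")  # non-greedy: stops at the first >
    print(regex.search(text).get_string())  # <b>
```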
A couple more symbols we can use are ^ at the start of the pattern, to match only from the start of the string, and $ at the end of the pattern, to match only up to the end of the string.
- ^start matches “start” in “start string” but does not match anything in “Go to start”
- end$ matches “end” in “to the end” but does not match anything in “the end is nigh.”
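These anchor examples can be checked with a short script. The report function is a hypothetical helper of mine that simply says whether a pattern was found.

```gdscript
extends Node

# Hypothetical helper (not a Godot built-in): reports whether pattern matches text.
func report(pattern: String, text: String) -> void:
    var regex = RegEx.new()
    regex.compile(pattern)
    if regex.search(text):
        print(pattern, " matched in: ", text)
    else:
        print(pattern, " did not match in: ", text)

func _ready() -> void:
    report("^start", "start string")     # matched
    report("^start", "Go to start")      # did not match
    report("end$", "to the end")         # matched
    report("end$", "the end is nigh.")   # did not match
```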
To extract sub-pattern matches, we use parentheses to group parts of the pattern. The result will then contain these sub-pattern matches alongside the whole match. Groups are also useful for specifying a list of alternatives separated with the or operator, which is a vertical bar |.
- [a-z]+\.(.+) matches “www.wiki.org” as a whole and captures “wiki.org” as the group in “www.wiki.org”
- (dog|cat) matches “cat” in “the cat sat on the mat”
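In code, the whole match and each group are read from the RegExMatch with get_string(): no argument (or 0) gives the whole match, and 1 gives the first group. A minimal sketch, again assuming a script attached to a Node:

```gdscript
extends Node

func _ready() -> void:
    var regex = RegEx.new()

    regex.compile("[a-z]+\\.(.+)")
    var m = regex.search("www.wiki.org")
    if m:
        print(m.get_string())   # www.wiki.org  (the whole match)
        print(m.get_string(1))  # wiki.org      (the first group)

    regex.compile("(dog|cat)")
    m = regex.search("the cat sat on the mat")
    if m:
        print(m.get_string(1))  # cat
```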
Note that dealing with line breaks can be problematic, so it’s a good idea to remove them before applying a RegEx to your text. If you do need a pattern to match across line breaks, you can prefix it with (?s), which lets the . wildcard match line breaks as well.
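As a rough sketch of the (?s) prefix in use, here the pattern has to cross two line breaks to get from “start” to “end”. The sample text is made up for illustration.

```gdscript
extends Node

func _ready() -> void:
    # \n in a GDScript string literal is a real line break.
    var text = "start\nmiddle\nend"
    var regex = RegEx.new()
    # (?s) lets the . wildcard match line breaks too, so .+ can span lines.
    regex.compile("(?s)start.+end")
    var m = regex.search(text)
    if m:
        print("matched ", m.get_string().length(), " characters across lines")
        # prints: matched 16 characters across lines
```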
Spaces may be entered as actual spaces or by using \s, which matches any whitespace character (spaces, tabs, and line breaks).
You can find various RegEx testers online to try out your RegEx patterns before you commit them to code.
One tip is to download and print out a RegEx Cheat Sheet since there are a lot of directives that I didn’t mention that may be hard to remember if you are not writing them often.
In summary: you may use RegEx to detect if patterns exist in text such as website URLs and telephone numbers, or use it for searching and replacing text patterns.
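For the search-and-replace case, the RegEx class has a sub method. This sketch uses a regional spelling variation as the pattern; the sample sentence and the replacement word are just illustrative choices of mine.

```gdscript
extends Node

func _ready() -> void:
    var regex = RegEx.new()
    # u? makes the u optional, so both spellings are matched.
    regex.compile("colou?r")
    var text = "The color red and the colour blue"

    # By default only the first match is replaced.
    print(regex.sub(text, "hue"))        # The hue red and the colour blue
    # Passing true as the third argument replaces every match.
    print(regex.sub(text, "hue", true))  # The hue red and the hue blue
```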
So this is a quick overview of what I think are the most useful things to know about RegEx and I hope that it helps.
You can read the official Godot RegEx docs here