Hyperstrings
Programmer’s manual.
By Alexey Pismenny
Overview
Hyperstrings, or in abbreviated
form, Hystrings are strings best suited for syntax analysis.
Unlike the container string classes where the instance of the class
allocates storage for the string, a Hystring is a collection of
pointer-length pairs that together form several segments of the
string. The storage, however, is not the responsibility of the
Hystring class. Thus, the string storage is separated from the string
analysis and manipulation.
Imagine, for example, a large text:
an article or a book. If our task as a programmer is to somehow
analyze the book’s structure it would be to our advantage to
allocate the storage for the book once and then make references to
the structural elements of the book without copying any of its
substrings into memory allocated in addition to the main book
storage. Our task might be to pick out the chapter titles in order to
make a table of content. The most direct approach is to form a vector
of std::wstring objects and populate it from the substrings of the
main string, which is the entire book, that are the titles. Now, in
addition to the storage allocated for the book we allocated storage
for its chapter titles. In order to create a table of content we will
have to combine the titles from our vector with page numbers and
formatting and allocate yet another string for the table of content.
Now each chapter title is stored in three places: in the book itself;
in the vector of the titles; and in the table of content string. The
processing done in that style consists of little more than allocating
memory and copying characters from one place to another.
Perhaps, the next task is to break
our book into pages. Again, the naïve approach would be to
allocate another vector of strings, which are the substrings
corresponding to each page. More copying, and the required storage
had to double: each character in the book is now stored twice: in the
original book and in the page where it belongs.
If the book has been analyzed in
this fashion and now needs to be edited, each edited character has to
be updated in two places. If the chapter title is edited, -- in at
least three places!
But at the same time the naïve
approach has the benefit of simplicity. The program that forms the
table of content and the pages practically writes itself in a few
self-evident lines of code. So-called container semantics are great
for casual string processing. They become an exercise in wasting time
and space for large text processing.
Using Hystrings, the same program
can be written with comparable clarity. However, the only large space
allocation will be for the book itself. The chapter titles, the table
of content and the individual pages will be pointers to the same book
storage coupled with the segment length.
More generally, a Hystring is a
vector of segments, and a segment is a pointer-length pair. The
pointers must be pointing to a collection of regular C strings called
Space. Space could be a single C-string (for example, the
entire book) or it could be a container of strings. The only
requirement is that the Hystring segment pointers should point to
allocated memory. The DEBUG version of the library even validates
each pointer against the registered space container and nulls the
invalid pointers.
To search for a substring of a
Hystring means to walk its segments and examine the characters
pointed to. To concatenate two Hystrings is to push the segments of
the second Hystring to the vetctor which is the first Hystring. To
delete a substring of a Hystring is to find the segment corresponding
to the first deleted character, and to the last deleted character. If
they are the same segment, shorten it and insert its tail as a new
segment. If the last character deleted is in another segment, shorten
the first segment, remove the intervening segments, and insert a new
segment made of the tail of the last segment. As you can see, the
internal algorithms of Hystring are sometimes non-trivial compared to
the scanning and array insertions and contractions done with
conventional C strings.
A Hystring can be converted to a
conventional string by walking the segments and copying into the new
contiguous storage. A conventional string can be registered with the
Hystring Space and then a Hystring can be made out of it.
Indeed some Hystring operations do
not conserve space at all. For example, to revert a string of 10
characters backward ten pointers and ten length values (of 1) must be
allocated in addition to the original string. However, for typical
text analysis operations: defining sections of text and making
combinations of them, Hystrings offer a significant advantage over
container-semantics strings.
{pg}
Introduction to Hystring
class and Space class
Let us discuss the Hystring header
({hystring.h}). It first defines the pointer-length pair, called
Segment. It is of these segments that the Hystring is made up; among
the protected data members we see
vector
< Segment > segments;
Hystring has rather self-evident
default and copy constructors.
Hystring has two constructors from
conventional strings:
Hystring(
const
char_t* text, unsigned
int
size );
Hystring(
const
char_t* text );
The difference between the two is
that the second constructor takes a null-terminated string. Caution
should be exercised when constructing a Hystring from conventional
strings of limited scope. This, for example, would be a programming
error:
Hystring
getProjectName()
{
char_t
name[] = {T("Eggbuilder")};
return
Hystring(name);
}
Unlike strings with container
semantics, Hystring is a collection of pointers. Since name
goes out of scope when the function returns, the
Hystring returned is unusable. Of course, a similar mistake can be
made if simply a pointer to the string declared in the function’s
scope were returned, -- that would be a typical beginner’s
mistake in C programming. Hystring, however, has a mechanism to
reduce such errors. Separate from Hystring, there is a singleton
class Space. Space is a single container for all text material out of
which we shall be constructing Hystrings. It can be described as a
string heap. The elements of Space are SpaceUnit instances (see
{space.h}).
The programmer is supposed to first
allocate his C strings in the Space. Then he can use them, or their
substrings as a building material for Hystrings. The Hystring
constructors expect that the string text is allocated in the Space.
The constructor can check if the constructor parameter is in the
Space by finding a SpaceUnit whose address range includes the entire
parameter string.
It would be cumbersome to always
enforce that the text
parameter is allocated in the Space right in the constructor; on the
other hand, not enforcing the rule at all would decouple Hystring
from Space altogether. We don’t want to do that because we
don’t want the programmer to have the burden of managing his
strings underlying Hystring.
The solution is to enforce the rule
in debug mode, but not in release. That is what
void
Segment::validate()
does. In debug mode, it asks the
Space singleton to validate the new segment. Space then walks its
SpaceUnit elements looking for one that encompasses the
pointer-length pair. Invalid segments are nulled (set to zero pointer
of zero length). The programmer might prefer to throw an exception,
but I found this technique most helpful exactly as implemented. If I
know that I was building my Hystring from a real string, and yet the
debugger shows that it has no length and points to nothing, I
immediately realize that I have a problem with the string’s
scope.
Let us look at Space header more
closely ({space.h} ). Its building block is SpaceUnit. That is also
a pointer-length pair (similar to the Hystring’s segments).
However, Space Unit also keeps track of the C string allocation. The
Boolean data member own
indicates whether the SpaceUnit instance also should deallocate the
pointer in the destructor. While SpaceUnit is similar to Segment, its
purpose is different. The SpaceUnit is supposed to be “large”,
whereas the Segment inside a Hystring is typically some “small”
substring of the SpaceUnit instance. For example, in our introductory
example the programmer would allocate one SpaceUnit for the entire
text of the book and that SpaceUnit would be the only unit in his
Space. If the programmer needs to work with two files, it would make
sense to have a Space comprising two SpaceUnits, one for each file.
Space is a map of SpaceUnit
instances, identified by a short integer handle. Space itself is a
singleton, whose instance methods can only be accessed through
static
Space* instance()
Space has some sophisticated methods
and elements that are utilized by classes that derive from Hystring.
We shall postpone studying those till we need them in the Hystring
derivations. The simple way to use Space is simply to pick a handle
for each SpaceUnit that you need, and call Space::reg
to create the SpaceUnit. For example, Eggbuilder reads
two files for input, the “source” and the “equates”.
The files are read into C strings pSource and pContent. In order to
process the two strings as Hystrings, two SpaceUnits are registered
(see {template.cpp}):
hsp_content
= Space::instance()->reg( HSP_CONTENT, pSource );
hsp_equates
= Space::instance()->reg( HSP_EQUATES, pEquates );
The all-capital handles are
suggested values; the reg method returns the same handle if it is
available, otherwise zero. Another approach is to use 0xffff for
suggested handle and Space will then return a handle value used
internally.
At this point Hystrings can be built
from the entire registered strings or from some parts of it. Note
that a Hystring can be built over segments from different SpaceUnits.
This, in fact, happens a lot in Eggbuilder, when the variables
occurring in pSource are replaced by values given to them in
pEquates. Here is a simple example:
char_t
pSource[] = T(“The dinner today is {dinner}.”);
char_t
pEquate[] = T(“dinner=fish”);
unsigned
short h1 = Space::instance()->reg( 0xffff, pSource );
unsigned
short h2 = Space::instance()->reg( 0xffff, pEquate );
Hystring
src( pSource ); // entire source
Hystring
eqt( pEquate ); // entire equate
//
build the equate
Hystring
name, value;
eqt.split(
T(’=’), name, value );
//
evaluate the source
//
split is explained in the next chapter; it will produce
// outs[…]
= “The dinner today is ”, “.”
// ins[0]
= “dinner”
vector
< Hystring > outs, ins;
unsigned
int count = src.split( T(’{’), T(’}’), outs,
ins );
if(
count > 0 && count != npos )
{
//
Variables (count) found in the source
//
Let us find their values
Hystring
result = outs[0]; // “The dinner today is “
for(
nVar = 0; nVar < count; ++nVar )
{
if( ins[nVar] == name )
{
//
this variable has a value
result.append(
value ); // “The dinner today is fish”
}
else
{
result.append(
T(’?’) );
}
}
}
//
here’s the evaluation result
result.append(
outs[count] ); // “The dinner today is fish.”
//
free the Space
Space::instance()->unreg(
h1 );
Space::instance()->unreg(
h2 );
Note that result has three segments,
two from the source SpaceUnit and one from the equates SpaceUnit.
{pg}
Hystring functionality
Let us read together the main
Hystring header, {hystring.h}.
The public sub-struct CharLocator
is a convenient way to identify a character of the Hystring, similar
to how a character index identifies a character of a conventional
string. CharLocator,
however, consists of two numbers, indexing the segment and the
character inside the segment.
Hystring also has assignment and
comparison operators. Consistently with the design objective,
assignment does not copy or move the string content pointed to by the
segments. The segment structure on the right hand side of the
assignment is, of course, copied to the assignee. Comparisons are
done with respect to the text represented by the Hystring; so two
Hystrings are equal whenever they yield the same sequence of
characters, regardless of the segment structure of the two. The core
method for all comparisons is
int
Hystring::compare( const
Hystring& other, unsigned
int&
len, unsigned
int
start ) const
If the programmer wishes to
understand the internals of the Hystring implementation, reading
through the Hystring::compare
method would be a good place to start. Note that there are actually
several Hystring::compare
methods allowing comparisons with std::string and C strings.
Reading down the header we find
length method and several classification methods.
Next we see a group called “locator
arithmetic”. They produce CharLocator values for the given
Hystring. Most Hystring internals use them as convenient abstractions
of the actual segments.
Following that, we find many methods
that one would expect in any string class. Usually, multiple versions
exist for each family of algorithms. When that is the case, there are
typically one or two universal methods that take most parameters, and
many simpler methods that are implemented in terms of the complex
ones.
Several method families deserve
special mention. To convert a Hystring to a conventional string, use
any of the print
methods.
The find_select
is indispensable in any syntax analysis; it allows
scanning the string looking for either of several target strings and
passing back the position of the target as well as the index of the
target.
There is a rich with features split
family of methods. In its simplest form, it breaks a Hystring around
a given token. For example, to split a Hystring into comma-separated
fragments, use:
vector
< Hystring > fragments;
unsigned
int count = myHystring.split( T(’,’), fragments );
If only two fragments are expected,
there is a simpler form
Hystring
left, right;
unsigned
int count = myHystring.split( T(’,’), left, right );
Returned is the number of split
fragments, so 1 indicates that the token was not found; there is
always at least one element in the fragment vector.
Often, a programmer needs to break
his string into fragments inside the brackets and outside the
brackets; that is, instead of a single token there are two tokens and
two fragment vectors. For example:
vector
< Hystring > outs, ins;
unsigned
int count = myHystring.split( T(’{’), T(’}’),
outs, ins );
This method returns the count of
inside fragments, that is the count of ins[]. The count of outs[] is
always the count of ins[] plus 1.
At times it is convenient to look
for several types of brackets at the same time. For example, consider
this marked-up text:
Dinner
today is {dinner}. <B>Don’t forget</B> to bring
wine!
Here two types of brackets are used,
braces (for variable substitution as you may guess) and the usual
HTML bold tags. Now instead of a single open bracket and a single
close bracket we need a vector of open brackets, the same-size vector
of close brackets, and an output array of indices into them, that
will tell us for each of the ins[] element, which type of bracket it
was split from.
vector
< Hystring > outs, ins;
vector
< Hystring > open, close;
vector
< Hystring > idx;
open.push_back(
HystringAuto( T(’{’) ) ); // see next chapter
open.push_back(
HystringAuto( T(“<b>”) ) );
close.push_back(
HystringAuto( T(’}’) ) );
close.push_back(
HystringAuto( T(“</b>”) ) );
unsigned
int count = myHystring.split( open, close, idx, outs, ins );
This will return 2 and the vectors
will be:
idx[0]
== 0; // in[0] is inside brackets open[0], close[0]
idx[1]
== 1; // in[1] is inside brackets open[1], close[1]
outs[]
== “Dinner today is ”, “. ”, “ to bring
wine!”
ins[]
== “dinner”, “Don’t forget”
Even more complex, but also
powerful, is find_in_split.
This solves the following problem. Suppose you have a text with
nested brackets (this time, all of the same kind). Suppose that you
want to find a target string, but ignore the target strings that are
inside brackets. Or more generally, you want to find the target
string on nesting level aLevel.
For example, we might want to find “dinner” when it is
inside the braces, but not inside two layers of braces, and not
outside of all braces. We will then specify aLevel
= 1. This scenario must account for the possibility
that the target string itself contains brackets, which are not
necessarily balanced inside the target. To account for that, the
method passes back the bracket level just after the target. If the
target contained, for example an unbalanced opening bracket, the
aLevel on
the output will be the aLevel
specified on the input plus 1.
{pg}
Classes that derive from
Hystring
HystringAuto
The necessity to allocate the space
for the Hystrings is a natural requirement in file processing, when
the entire file content is allocated right after it is read. But
sometime we need to make a Hystring out of a string of limited scope,
in order to use the powerful analytical methods that come with
Hystring. Hystring Auto is a kind of Hystring that is well-suited for
underlying C strings of limited scope. We already used a HystringAuto
in the preceding example of split:
open.push_back(
HystringAuto( T(’{’) ) );
open.push_back(
HystringAuto( T(“<b>”) ) );
In that example, open was a vector
of Hystrings declared in some limited scope that was not shown in the
example. The vector elements were initially empty Hystrings,
requiring no corresponding SpaceUnit. Now we want to create a
Hystring out of a single character or out of a short C string. That
would cause the Hystring constructor to build a segment out of the
temporary strings “{“ and “<b>”, and in
Debug the Segment constructor will fail. We don’t want
Hystrings to be built from temporary C strings, because such Hystring
would have hanging pointers if it were to leave the scope of the
underlying C strings.
Certainly we don’t want to
register all the sundry strings that we might use inside various
methods ahead of time in some initialization section. Instead, we
want a derived class that would register the string with which it is
constructed just for the scope of its existence, and then the
destructor would unregister it.
And that is what HystringAuto is.
While its instance exists, the string is registered; when the
instance goes out of scope, it is unregistered. Because the class is
anticipated to be frequently used with single characters, it makes a
single character into a string internally. When built over a string,
that string is not copied and is registered as passed. The programmer
must ensure that the scope of the underlying C string is not shorter
than the scope of HystringAuto. It is not recommended to concatenate
HystringAuto instances. If the programmer feels the need to do string
arithmetic on HystringAuto, he should write the necessary Append
method that ensures that the appended segments remain registered.
This class has been supplanted by
HystringContainer in the Eggbuilder project.
HystringContainer
A desire to build a Hystring that
allocates itself but also can be appended, inserted into, its
substrings replaced, -- all that without worry that the underlying C
string might go away, -- has led to the development of this class.
Since HystringContainer’s anticipated use is casual, and often
over short limited scope strings, we do not allocate and register a
string each time a HystringContainer instance is needed. Instead we
maintain a pool of registered strings whose size is a multiple of
fixed “chunk size”. As another HystringContainer is
created, the init method tries to push the underlying C string on the
same handle that was used previously, appending the new string to the
pooled string. If the new string cannot be appended within the size
already allocated, a similar attempt is done with older pooled
strings. Finally, if no suitable pooled string exists, a new pooled
string is allocated and accommodates the new string.
This works well for relatively short
strings, which are concatenated together into a few in number pooled
strings. Nothing happens when a HystringContainer is destructed: its
underlying string will be released when the entire Space is
destructed at the end of the program. For that reason, it is not
desirable to use HystringContainer for temporary variables that need
to allocate changing string content. Eggbuilder uses
HystringContainer mostly for its fixed syntactic elements such as
keywords and delimiters.
In order to understand how
HystringContainer manages its pool of registered strings, read
HystringContainer::init.
HystringContainer maintains a static pool of handles
static
vector < ContainerHandle > container_handles;
HystringContainer::init method
cooperates with Space::push method. The latter attempts to append a
copy of the string onto the given handle while maintaining the given
string size. The granularity with which the pooled string size is
calculates is also determined by Space::chunk_size(). It is currently
set to 0xfff = 4095 bytes.
Important feature of
HystringContainer is that its pointer can be taken directly as
const
char_t* ptr()
For regular Hystrings that would not
be possible, because a Hystring may have multiple segments, and even
one segment is meaningless without knowing its length. However,
HystringContainer is always allocated in a single contiguous storage
in the Space and it is null-terminated. So a direct pointer to its
allocation can be produced.
HystringContainerNumber
Very often, string content is
various representations of numbers. Pages, and sometimes paragraphs
need to be numbered; arrays and tables have rows and column numbers.
In text processing, we still do number arithmetic; however, we also
distinguish between numbers that are the same numerically but are
differently formatted. We need to process numbers in the same way we
process other Hystrings; at the same time it makes little sense to
store several copies of the same string representing the same format
of the same numerical value. Yet that is what would happen if numbers
were stored as HystringContainers and converted to numerical value on
the fly.
HystringContainerNumber derives from
HystringContainer and resolves these issues, at least for whole
numbers. Let us read {Hystring_container_number.h} together.
First, we define flags and format
specifiers that apply to whole numbers. A FormattedNumber is defined
as a struct containing the numerical long value, its format,
formatting flags and the width. We define equality and comparison
operator that take into account the formatting attributes as well as
the numerical value.
We decide to create a special
storage for numbers, separate from the pooled strings used by
HystringContainer parent. That is because we do not want to store the
same FormattedNumber several times. Our number storage, HCN_map, is a
map keyed by FormattedNumber, pointing to the familiar to us
handle-string pair, which this class calls HCN_storage.
Instead of a public constructor, --
that would always produce a new class instance when called, -- we
create HystringContainerNumber instances from a factory. We define
several commands that the Factory takes: HNF_Init
and HNF_Exit must
be called when the HyperString library is initialized in HS_Init()
and HS_Exit();
HNF_Make
produces a HystringContainerNumber, and HNF_Delete
deletes it from the storage.
Next, let us examine the Factory.
The execution of HNF_Init
and HNF_Exit is
straightforward: the _numbers
storage is respectively allocated or freed.
For HNF_Make
the Factory makes a local instance of FormattedNumber and looks for
it in the _numbers
storage. If that FormattedNumber is already allocated, the Factory
returns a copy of the allocated instance, and passes back the Space
handle in case the caller needs it. If the FromattedNumber is not in
the storage yet, the Factory uses the protected constructor to
allocate the string representation of the number as a
HystringContainer, and inserts its Space handle and the pointer in
the _numbers
storage.
HNF_Delete
removes and erases the given FormattedNumber if it had been stored.
HystringContainerNumber can be
incremented. The method works much like the factory’s HNF_Make.
{pg}
Hystring
Library
Hystring is a complex class, not
even counting the derived classes. A functional wrapper around
Hystring is Hystring Library. The library is designed to resolve
several parsing problems using Hystring class as its internals.
The library is not large. It is
implemented in two files, hslib.h and hslib.cpp.
The main purpose is to support
scenarios when multiple types of bracket pairs exist and need to be
parsed. The “base” bracket pair is something like “{“
and “}”. Those need to be define once at the library
initiation. Additional “custom” bracket pairs can be
constructed using the base. For example {begin} and {end} are
intuitively recognizable bracket pairs that can be created
dynamically and parsed against. Further, at times we want different
opening custom bracket to be closed with the same closing custom
bracket: for example, it might be a reasonable syntax to have opening
brackets “{begin}”, “{block}”, “{if}”,
“{for}” all close on the same closing bracket “{end}”.
Because different opening brackets may associate with their own sets
of closing brackets, they are called “associative brackets”.
Another desired feature is opening
bracket embedding other code or variables. For example, it would be
nice to be able to define only “{if}” and “{end}”
as a custom bracket pair, but actually parse expressions like “{if:
x > 0} … {end}”. This task is facilitated by Hystring
Library as well. The idea is to use unbalanced opening brackets, but
parse to the rest of the text till the base bracket is balanced. The
remainder between the unbalanced custom bracket and the point when it
becomes balanced in terms of the base is called the complement. For
example, if we define “{if: ” as an opening bracket, the
library will parse “{if: x > 0} … {end}” and
define “x > 0” as the opening bracket's complement.
The plane text between two brackets
is called inside fragment. So, given proper bracket definitions,
“{if: x > 0} it's
positive! {end}”
will parse as follows:
“{if: “ is the associative
opening bracket
“x > 0” is the
complement
“it's positive” is the
inside fragment
{end} is the associated closing
bracket.
Let us examine the header file.
void
HS_Init(const
char_t*
baseOpen, const
char_t*
baseClose)
This
needs to be called once. Currently, there is no corresponding
HS_Exit.
The parameters define the base brackets.
typedef
pair
< Hystring,
vector
< const
Hystring
> > HystringAssociation;
This
is the fundamental type for the library, in this context called
shortly association. It associates a given string with an array of
its closing strings. For example, in the Egg language "{set"
can be closed by "{endset}" or "{endif}" or
"{endfor}".
struct
HS_Fragment
{
unsigned
short
idx; //
index into opens
unsigned
short
minor; //
index into opens[idx].second
Hystring
in; //
inside fragment
Hystring
complement; //
complement
};
Another
fundamental Hystring Library type. This is what the parser returns
for each parsed fragment. The index 'idx' is into a given array of
associations. The other index, 'minor', is into the array of possible
closes for the opening bracket that defined the fragment. String 'in'
is the inside fragment, i.e. the text between entire (complemented)
opening bracket and the closing bracket. String 'complement' is what
complemented the opening bracket to balance.
string_t
HS_Print(const
vector
< Hystring
>& strings);
string_t
HS_Print(const
HystringAssociation&
assoc);
string_t
HS_Print(const
vector
< HystringAssociation
>& opens);
string_t
HS_Print(const
HS_Fragment&
fragment);
string_t
HS_Print(const
vector
< HS_Fragment
>& fragments);
These
print the corresponding object for debugging and logging purposes.
unsigned
int
HS_Split(const
Hystring&
src, const
vector
< Hystring
>& tokens, vector
< HS_Fragment
>& fragments, unsigned
int
maxCount = 0);
This
function splits the given string between tokens. No associations are
involved. If maxCount is specified, the function quits after finding
maxCount fragments. By convention,
fragments[n].idx is index into the array of tokens for the token to
the left of the fragment (or 0xffff for the leftmost fragment);
fragments[n].minor is index into the array of tokens for the token to
the right of the fragment (or 0xffff for the rightmost fragment).
unsigned
int
HS_AssociativeSplit(
const
Hystring&
src,
const
vector
< HystringAssociation
>& opens,
vector
< Hystring
>& outs,
vector
< HS_Fragment
>& fragments,
unsigned
int
maxCount = 0);
This
is the most powerful parsing method in this library. Fpr a given
string and a given array od associations, pass back the array
of strings falling outside of matching brackets, and an array of
fragments inside associated brackets. Optionally, limit the parsing
to a given number of fragments.
Other
methods:
unsigned
int
HS_FindMatch(const
Hystring&
src, const
vector
< HystringAssociation
>& opens, unsigned
int
nodeOpen, const
CharLocator*
locNode, CharLocator*
locNext, CharLocator*
locMatch = 0);
This
is a helper method.
Knowing that nodeOpen is found and locOpen points just after it, find
a matching close, recursing on nested opens. Return index into the
associated closes vector; pass back the locator of the matching close
and the locator right after the matching close.
int
HS_BracketLevel(const
Hystring&
src, const
char_t*
baseOpen = 0, const
char_t*
baseClose = 0);
This
is a helper method. It computes the balance of given opening and
closing brackets in a string. By default, the base brackets are used.
unsigned
int
HS_BracketLevelSubstr(
const
Hystring&
src, //
source string
int
lvlTarget, //
level to look for
Hystring&
aSub, //
resulting substring
const
CharLocator*
pLocBegin = 0, const
CharLocator*
pLocEnd = 0, //
brackets of extraction in src
const
char_t*
baseOpen = 0, const
char_t*
baseClose = 0 //
base brackets
);
This
is another helper. From a given string, a given base bracket level is
extracted. For example, using
the string "A{B}C}E" and looking for level -1 we find,
starting from pos 0, "A{B}C".