Every URL defines a path.
For opaque schemes such as mailto, the path is considered a single unit which can be used with functions like encoded_path or set_path.
In this case it is up to the user to apply the scheme-specific syntax and
semantics to further refine the URL contents, either for validation or interpretation.
For URLs using hierarchical schemes such as http or wss, the path is interpreted as a slash-delimited sequence of percent-escaped strings called segments. The following URL contains a path with three segments: "path", "to", and "file.txt":
http://www.example.com/path/to/file.txt
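To make the idea of a slash-delimited sequence concrete, here is a minimal standalone sketch that uses only the standard library. It is not the library's implementation: split_path is a hypothetical helper, the segments are left percent-encoded, and the rule it applies (drop one leading slash, then split on every remaining slash) is a simplifying assumption that does not model special cases such as a leading "./".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of any library: split the path component of
// a hierarchical URL into its still-encoded segments.
// Simplifying assumption: one leading '/' marks an absolute path and is
// dropped; every remaining '/' separates two segments.
std::vector<std::string> split_path(std::string const& path)
{
    std::vector<std::string> out;
    std::size_t pos = (!path.empty() && path[0] == '/') ? 1 : 0;
    if (pos == path.size())
        return out; // "" and "/" have no segments in this sketch
    for (;;)
    {
        std::size_t next = path.find('/', pos);
        if (next == std::string::npos)
        {
            out.push_back(path.substr(pos)); // last segment
            break;
        }
        out.push_back(path.substr(pos, next - pos));
        pos = next + 1;
    }
    return out;
}

// Expected results for a few inputs under the simplified rules above.
void split_path_examples()
{
    assert((split_path("/path/to/file.txt") ==
        std::vector<std::string>{ "path", "to", "file.txt" }));
    assert(split_path("/").empty());
    assert((split_path("a//b") ==
        std::vector<std::string>{ "a", "", "b" }));
}
```

Note that an empty segment is still a segment: two adjacent slashes produce an empty string in the middle of the sequence rather than being collapsed.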
We use the word path to refer to the path string, and the word segments to mean a slash-delimited sequence. In this library, segments are represented using containers modeling bidirectional ranges. For example, the member function encoded_segments returns a container called segments_encoded_ref which may be iterated, and which references the underlying character buffer without taking ownership. Here we define the function segs, which returns a std::list formed by appending each decoded segment in the path:
auto segs( url_view const& u ) -> std::list< std::string >
{
    std::list< std::string > seq;
    for( auto s : u.encoded_segments() )
        seq.push_back( s.decode() );
    return seq;
}
Note
In the remainder of this section we refer to the sequence returned from a call to segs. For example, segs( url_view( "/path/to/file.txt" ) ) produces the sequence { "path", "to", "file.txt" }. The term sequence always refers to the elements returned by such a call.
We start with the following invariants about paths:
In this table we show the result of invoking segs with different URLs containing paths. This demonstrates how the library achieves the invariants described above for various interesting cases:
Table 1.4. Path Sequences
s | | absolute |
---|---|---|
 | | |
 | | yes |
 | | |
 | | |
 | | yes |
 | | yes |
 | | |
 | | yes |
 | | |
 | | yes |
 | | |
Library algorithms which modify individual segments of the path or set the entire path attempt to behave as if the operation were performed on the equivalent sequence. If a path maps, say, to the three-element sequence { "a", "b", "c" }, then erasing the middle segment should result in the sequence { "a", "c" }. The library always strives to do exactly what the caller requests; however, in some cases this would result in either an invalid URL, or a dramatic and unwanted change in the URL's semantics.
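The sequence-equivalence described above can be sketched with standard containers. This is an illustration only, not the library's algorithm: join_segments and erase_segment are hypothetical helpers, the path is assumed to be relative, and the segments are assumed to be already encoded.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: serialize a sequence of segments as a relative path
// by placing a '/' between consecutive segments.
std::string join_segments(std::vector<std::string> const& segs)
{
    std::string path;
    for (std::size_t i = 0; i < segs.size(); ++i)
    {
        if (i != 0)
            path += '/';
        path += segs[i];
    }
    return path;
}

// Hypothetical helper: erase the segment at `index` from the sequence, then
// rebuild the path string from what remains.
std::string erase_segment(std::vector<std::string> segs, std::size_t index)
{
    assert(index < segs.size()); // precondition: index names a segment
    segs.erase(segs.begin() + static_cast<std::ptrdiff_t>(index));
    return join_segments(segs);
}
```

Under these assumptions, erase_segment( { "a", "b", "c" }, 1 ) yields the path string "a/c", which maps back to the sequence { "a", "c" }.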
As an example of a request that cannot be honored literally, consider the following URL:
url u = url().set_path( "kyle:xy" );
The library will produce the URL string "./kyle:xy" and not "kyle:xy", because the latter would be parsed as having the scheme "kyle", which is clearly not intended.
This table shows a URL string, a modification operation, and the URL string which results from applying the operation:
Table 1.5. Path Operations
URL | Operation | Result |
---|---|---|
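The "./" prefix in the set_path example above follows from RFC 3986, which does not allow a colon in the first segment of a relative-path reference because it would be mistaken for a scheme. The check can be sketched as follows, using only the standard library. The name guard_first_segment is hypothetical, and the sketch assumes the URL has neither a scheme nor an authority; it is not how the library is implemented.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: given a path that will be serialized as a complete
// URL reference with no scheme and no authority, return a path that cannot
// be misread as "scheme:rest".
std::string guard_first_segment(std::string const& path)
{
    // The first segment is everything before the first '/'.
    std::string::size_type const slash = path.find('/');
    std::string::size_type const colon = path.find(':');
    bool const colon_in_first_segment =
        colon != std::string::npos &&
        (slash == std::string::npos || colon < slash);
    if (colon_in_first_segment)
        return "./" + path; // the dot segment keeps the colon out of first position
    return path;
}

// Expected results for a few inputs.
void guard_first_segment_examples()
{
    assert(guard_first_segment("kyle:xy") == "./kyle:xy");
    assert(guard_first_segment("kyle/x:y") == "kyle/x:y");   // colon is in a later segment
    assert(guard_first_segment("/kyle:xy") == "/kyle:xy");   // absolute path, no ambiguity
}
```

A colon in any later segment is harmless, since a scheme can only appear before the first slash.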
For the full set of containers and functions for operating on paths and segments, please consult the reference.