Boost C++ Libraries Home Libraries People FAQ More

PrevUpHomeNext

Path

Every URL defines a path. For opaque schemes such as mailto, the path is considered a single unit which can be used with functions like encoded_path or set_path. In this case it is up to the user to apply the scheme-specific syntax and semantics to further refine the URL contents, either for validation or interpretation. For URLs using hierarchical schemes such as http or wss, the path is interpreted as a slash delimited sequence of percent-escaped strings called segments. The following URL contains a path with three segments: "path", "to", and "file.txt":

http://www.example.com/path/to/file.txt

We use the word path to refer to the path string, and the word segments to mean a slash delimited sequence. In this library segments are represented using containers modeling bidirectional ranges. For example the member function encoded_segments returns a container called segments_encoded_ref which may be iterated, and references the underlying character buffer without taking ownership. Here we define the function segs which returns a std::list formed by appending each segment in the path:

auto segs( url_view const& u ) -> std::list< std::string >
{
    std::list< std::string > seq;
    for( auto s : u.encoded_segments() )
        seq.push_back( s.decode() );
    return seq;
}
[Note] Note

In the remainder of this section we use the sequence returned from a call to the segs function above to demonstrate the outcomes of parsing and modification of path segments. For example, calling

segs( url_view( "/path/to/file.txt" ) )

produces the sequence.

{ "path", "to", "file.txt" }

The term sequence always refers to the elements returned by segs.

We start with the following invariants about paths:

In this table we show the result of invoking segs with different URLs containing paths. This demonstrates how the library achieves the invariants described above for various interesting cases:

Table 1.4. Path Sequences

s

segs( url( s ) )

absolute

""

{ }

"/"

{ }

yes

"./"

{ "" }

"./usr"

{ "usr" }

"/index.htm"

{ "index.htm" }

yes

"/images/cat-pic.gif"

{ "images", "cat-pic.gif" }

yes

"images/cat-pic.gif"

{ "images", "cat-pic.gif" }

"/fast//query"

{ "fast", "", "query" }

yes

"fast//"

{ "fast", "", "" }

"/./"

{ "" }

yes

".//"

{ "", "" }


Library algorithms which modify individual segments of the path or set the entire path attempt to behave consistently with the behavior expected as if the operation was performed on the equivalent sequence. If a path maps, say, to the three element sequence { "a", "b", "c" } then erasing the middle segment should result in the sequence { "a", "c" }. The library always strives to do exactly what the caller requests; however, in some cases this would result in either an invalid URL, or a dramatic and unwanted change in the URL's semantics.

For example consider the following URL:

url u = url().set_path( "kyle:xy" );

The library will produce the URL string "./kyle:xy" and not "kyle:xy", because the latter would have a scheme which is clearly not intended. This table shows a URL string, a modification operation, and the URL string which results from applying the operation:

Table 1.5. Path Operations

URL

Operation

Result

"info:kyle:xy"

remove_scheme()

"kyle%3Axy"

"kyle%3Axy"

set_scheme( "gopher" )

"gopher:kyle:xy"

"http://www.example.com//kyle:xy"

remove_authority()

"http:/.//kyle:xy"

"//www.example.com//kyle:xy"

remove_authority()

"/.//kyle:xy"

"http://www.example.com//kyle:xy"

remove_origin()

"/.//kyle:xy"

"info:kyle:xy"

remove_origin()

"kyle%3Axy"

"/kyle:xy"

set_path_absolute( false )

"kyle%3Axy"

"kyle%3Axy"

set_path_absolute( true )

"/kyle:xy"

""

set_path( "kyle:xy" )

"kyle%3Axy"

""

set_path( "//foo/fighters.txt" )

"/.//foo/fighters.txt"

"my%3Asharona/billa%3Abong"

normalize()

"my%3Asharona/billa:bong"

"./my:sharona"

normalize()

"my%3Asharona"


For the full set of containers and functions for operating on paths and segments, please consult the reference.


PrevUpHomeNext