This section is intended to give the reader a brief overview of the features and interface style of the library.
Note | |
---|---|
Sample code and identifiers used throughout are written as if the following declarations are in effect: #include <boost/url.hpp> using namespace boost::urls; |
We begin by including the library header file which brings all the symbols into scope.
#include <boost/url.hpp>
Alternatively, individual headers may be included to obtain the declarations for specific types.
You will need to link your program with the Boost.URL built library. You must install binaries in a location that can be found by your linker.
If you followed the Boost Getting Started instructions, that's already been done for you.
To use Boost.URL as a header-only library, that is, to eliminate the requirement to link a program against a static or dynamic Boost.URL library, place the following line in exactly one new or existing source file in your project.
// In exactly *one* source file
#include <boost/url/src.hpp>
Then define BOOST_URL_NO_LIB and include the library headers in any file that uses Boost.URL.
// In any other source file
#define BOOST_URL_NO_LIB
#include <boost/url.hpp>
This "header-only" configuration needs BOOST_URL_NO_LIB
defined when building with compilers supporting auto-linking, such as Microsoft
Visual C++. The macro will instruct Boost to deactivate auto-linking.
Say you have the following URL that you want to parse:
string_view s = "https://user:pass@example.com:443/path/to/my%2dfile.txt?id=42&name=John%20Doe+Jingleheimer%2DSchmidt#page%20anchor";
In this example, string_view is an alias to boost::core::string_view, a string_view implementation that is implicitly convertible to std::string_view.
The library namespace includes the aliases string_view, error_code, and result.
You can parse the string by calling this function:
result<url_view> r = parse_uri( s );
The function parse_uri returns an object of type result<url_view>, which is a container resembling a variant that holds either an error or an object. A number of functions are available to parse different types of URL.
We can immediately call result::value to obtain a url_view.
url_view u = r.value();
Or simply
url_view u = *r;
When there are no errors, result::value returns an instance of url_view, which holds the parsed result. result::value throws an exception on a parsing error.
Alternatively, the functions result::has_value and result::has_error can be used to check whether the string was parsed without errors.
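The behavior described above (a variant-like object that holds either a value or an error, throws from value on error, and can be queried first) can be sketched with the standard library alone. This is not the library's result type: simple_result and parse_port are hypothetical names introduced only for illustration, and the sketch assumes C++17.

```cpp
#include <cassert>
#include <string_view>
#include <system_error>
#include <utility>
#include <variant>

// Hypothetical stand-in for a result type: holds either a T or an error code.
template <class T>
class simple_result {
    std::variant<T, std::error_code> v_;

public:
    simple_result(T value) : v_(std::move(value)) {}
    simple_result(std::error_code ec) : v_(ec) {}

    bool has_value() const { return v_.index() == 0; }
    bool has_error() const { return v_.index() == 1; }

    // Returns the stored value, or throws std::system_error if an error is stored.
    T const& value() const {
        if (has_error())
            throw std::system_error(std::get<1>(v_));
        return std::get<0>(v_);
    }

    // Returns the stored error, or a default-constructed (success) code.
    std::error_code error() const {
        return has_error() ? std::get<1>(v_) : std::error_code();
    }
};

// Hypothetical parser used only for illustration: accepts decimal port numbers.
inline simple_result<int> parse_port(std::string_view s) {
    if (s.empty() || s.size() > 5)
        return std::make_error_code(std::errc::invalid_argument);
    int n = 0;
    for (char c : s) {
        if (c < '0' || c > '9')
            return std::make_error_code(std::errc::invalid_argument);
        n = n * 10 + (c - '0');
    }
    if (n > 65535)
        return std::make_error_code(std::errc::result_out_of_range);
    return n;
}

// Checking before access avoids the exception path entirely.
inline void result_demo() {
    simple_result<int> ok = parse_port("443");
    assert(ok.has_value() && ok.value() == 443);

    simple_result<int> bad = parse_port("44x");
    assert(bad.has_error());
    assert(bad.error() == std::errc::invalid_argument);
}
```

Calling value without checking is convenient when a failure is truly exceptional, while has_value and has_error suit code paths where malformed input is expected.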
Note | |
---|---|
As long as the contents of the original string are unmodified, constructed URL views always contain a valid URL in its correctly serialized form.
If the input does not match the URL grammar, an error code is reported through the returned result.
|
Accessing the parts of the URL is easy:
url_view u( "https://user:pass@example.com:443/path/to/my%2dfile.txt?id=42&name=John%20Doe+Jingleheimer%2DSchmidt#page%20anchor" ); assert(u.scheme() == "https"); assert(u.authority().buffer() == "user:pass@example.com:443"); assert(u.userinfo() == "user:pass"); assert(u.user() == "user"); assert(u.password() == "pass"); assert(u.host() == "example.com"); assert(u.port() == "443"); assert(u.path() == "/path/to/my-file.txt"); assert(u.query() == "id=42&name=John Doe Jingleheimer-Schmidt"); assert(u.fragment() == "page anchor");
URL paths can be further divided into path segments with the function url_view::segments.
Although URL query strings are often used to represent key/value pairs, this interpretation is not defined by RFC 3986, so the query is not formally a compound element and users can treat it as a single entity. For the key/value interpretation, url_view provides the function url_view::params, which exposes the query as a range of key/value pairs.
Code |
Output |
---|---|
for (auto seg: u.segments()) std::cout << seg << "\n"; std::cout << "\n"; for (auto param: u.params()) std::cout << param.key << ": " << param.value << "\n"; std::cout << "\n"; |
path to my-file.txt id: 42 name: John Doe Jingleheimer-Schmidt |
These functions return decode_view objects, which are constant views referring to sub-ranges of the underlying URL string. Because a decode_view simply references the relevant portion of the URL string, it can represent a percent-decoded string without allocating memory.
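To make the idea of an allocation-free decoded view concrete, the following sketch compares a percent-encoded string against plain text by decoding one character at a time. It is a standard-library illustration only, not how decode_view is implemented; decoded_equals and hex_value are hypothetical helpers, and C++17 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Value of a hexadecimal digit, or -1 if c is not one.
inline int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Compares a percent-encoded string against plain text, decoding on the fly
// so that no decoded copy of `encoded` is ever created.
inline bool decoded_equals(std::string_view encoded, std::string_view plain) {
    std::size_t i = 0, j = 0;
    while (i < encoded.size() && j < plain.size()) {
        char c = encoded[i];
        if (c == '%') {
            if (i + 2 >= encoded.size())
                return false;                    // truncated escape
            int hi = hex_value(encoded[i + 1]);
            int lo = hex_value(encoded[i + 2]);
            if (hi < 0 || lo < 0)
                return false;                    // malformed escape
            c = static_cast<char>(hi * 16 + lo);
            i += 3;
        } else {
            ++i;
        }
        if (c != plain[j])
            return false;
        ++j;
    }
    return i == encoded.size() && j == plain.size();
}

inline void decoded_equals_demo() {
    assert(decoded_equals("/path/to/my%2dfile.txt", "/path/to/my-file.txt"));
    assert(decoded_equals("page%20anchor", "page anchor"));
    assert(!decoded_equals("page%20anchor", "page%20anchor"));
}
```

Both arguments are only read through views, which is why comparisons such as u.path() == "/path/to/my-file.txt" need not allocate.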
These functions return an empty string when a component is absent:
url_view u1 = parse_uri( "http://www.example.com" ).value(); assert(u1.fragment().empty()); assert(!u1.has_fragment());
They also return an empty string when the component is present but empty, so a function such as has_fragment is needed to tell the two cases apart:
url_view u2 = parse_uri( "http://www.example.com/#" ).value(); assert(u2.fragment().empty()); assert(u2.has_fragment());
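The difference between an absent and an empty component can also be modeled with std::optional. The sketch below is a simplified illustration that only looks for the first '#' in a string; find_fragment is a hypothetical helper, not part of the library, and C++17 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Returns std::nullopt when the URL has no '#', and a (possibly empty)
// view of everything after the first '#' otherwise.
inline std::optional<std::string_view> find_fragment(std::string_view url) {
    std::size_t pos = url.find('#');
    if (pos == std::string_view::npos)
        return std::nullopt;
    return url.substr(pos + 1);
}

inline void fragment_demo() {
    // Absent: there is no '#' at all.
    assert(!find_fragment("http://www.example.com").has_value());

    // Present but empty: the '#' is there, with nothing after it.
    auto empty = find_fragment("http://www.example.com/#");
    assert(empty.has_value() && empty->empty());
}
```

The library expresses the same distinction with a pair of functions, fragment and has_fragment, rather than with an optional return type.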
Not every component has a corresponding function, such as has_authority, to check for its existence, because some URL components are mandatory.
When applicable, the encoded components can also be directly accessed through a string_view:
Code |
Output |
---|---|
std::cout << "url : " << u << "\n" "scheme : " << u.scheme() << "\n" "authority : " << u.encoded_authority() << "\n" "userinfo : " << u.encoded_userinfo() << "\n" "user : " << u.encoded_user() << "\n" "password : " << u.encoded_password() << "\n" "host : " << u.encoded_host() << "\n" "port : " << u.port() << "\n" "path : " << u.encoded_path() << "\n" "query : " << u.encoded_query() << "\n" "fragment : " << u.encoded_fragment() << "\n"; |
url : https://user:pass@example.com:443/path/to/my%2dfile.txt?id=42&name=John%20Doe+Jingleheimer%2DSchmidt#page%20anchor scheme : https authority : user:pass@example.com:443 userinfo : user:pass user : user password : pass host : example.com port : 443 path : /path/to/my%2dfile.txt query : id=42&name=John%20Doe+Jingleheimer%2DSchmidt fragment : page%20anchor |
An instance of decode_view
provides a number of functions
to persist a decoded string:
Code |
Output |
---|---|
// VFALCO?
|
id=42&name=John Doe Jingleheimer-Schmidt page anchor |
decode_view and its decoding functions are designed to perform no memory allocations unless the algorithm in which they are used needs the result in another container. The design also permits recycling objects to reuse their memory, or at least minimizes the number of allocations by deferring them until the result is actually needed by the application.
In the example above, the memory owned by str
can be reused to
store other results. This is also useful when manipulating URLs:
u1.set_host(u2.host());
If u2.host()
returned
a value type, then two memory allocations would be necessary for this operation.
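The buffer-recycling idea discussed above can be sketched as a decoder that writes into a caller-supplied std::string instead of returning a new one. This is a standard-library illustration under stated assumptions, not the library's interface: decode_into and from_hex are hypothetical names, malformed escapes are copied through unchanged, and C++17 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Value of a hexadecimal digit, or -1 if c is not one.
inline int from_hex(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes `encoded` into `out`, replacing its previous contents.
// clear() drops the old characters; implementations typically keep the
// capacity, so repeated calls tend to reuse the same allocation.
inline void decode_into(std::string_view encoded, std::string& out) {
    out.clear();
    for (std::size_t i = 0; i < encoded.size(); ++i) {
        char c = encoded[i];
        if (c == '%' && i + 2 < encoded.size()) {
            int hi = from_hex(encoded[i + 1]);
            int lo = from_hex(encoded[i + 2]);
            if (hi >= 0 && lo >= 0) {
                c = static_cast<char>(hi * 16 + lo);
                i += 2;                  // skip the two hex digits
            }
        }
        out.push_back(c);
    }
}

inline void decode_demo() {
    std::string buf;                     // one buffer, reused for every component
    decode_into("John%20Doe", buf);
    assert(buf == "John Doe");
    decode_into("page%20anchor", buf);
    assert(buf == "page anchor");
}
```

Because the caller owns the destination, a loop that decodes many components can keep a single string alive instead of creating one per component.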
Another common use case is converting URL path segments into filesystem paths:
Code |
Output |
---|---|
boost::filesystem::path p; for (auto seg: u.segments()) p.append(seg.begin(), seg.end()); std::cout << "path: " << p << "\n"; |
path: "path/to/my-file.txt" |
In this example, only the internal allocations of filesystem::path
need to happen. In many common use cases, no allocations are necessary at all,
such as finding the appropriate route for a URL in a web server:
auto match = []( std::vector<std::string> const& route, url_view u) { auto segs = u.segments(); if (route.size() != segs.size()) return false; return std::equal( route.begin(), route.end(), segs.begin()); };
This allows us to easily match files in the document root directory of a web server:
std::vector<std::string> route = {"community", "reviews.html"}; if (match(route, u)) { handle_route(route, u); }
For many simpler use cases, converting the view to a string might be sufficient:
auto function = [](string_view str) { std::cout << str << "\n"; };
The path and query parts of the URL are treated specially by the library. While they can be accessed as individual encoded strings, they can also be accessed through special view types.
This code calls encoded_segments
to obtain the path
segments as a container that returns encoded strings:
Code |
Output |
---|---|
segments_encoded_view segs = u.encoded_segments(); for( auto v : segs ) { std::cout << v << "\n"; } |
path to my%2dfile.txt |
As with other url_view
functions which return encoded strings, the encoded segments container does
not allocate memory. Instead it returns views to the corresponding portions
of the underlying encoded buffer referenced by the URL.
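As a rough illustration of segments that are views into the underlying buffer, the sketch below walks a path and hands each encoded segment to a callback without copying any characters. It is not the library's container: for_each_segment is a hypothetical helper written against the standard library, and C++17 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Calls f(segment) for each '/'-separated segment of `path`.
// Every segment is a view into the caller's buffer; nothing is copied.
template <class F>
void for_each_segment(std::string_view path, F f) {
    if (!path.empty() && path.front() == '/')
        path.remove_prefix(1);           // skip the leading '/'
    if (path.empty())
        return;                          // "" and "/" have no segments
    for (;;) {
        std::size_t pos = path.find('/');
        if (pos == std::string_view::npos) {
            f(path);                     // last segment
            return;
        }
        f(path.substr(0, pos));
        path.remove_prefix(pos + 1);
    }
}

inline void segments_demo() {
    std::string_view path = "/path/to/my%2dfile.txt";
    std::size_t count = 0;
    std::string_view last;
    for_each_segment(path, [&](std::string_view seg) {
        if (count == 0)
            assert(seg.data() == path.data() + 1);  // points into `path`
        last = seg;
        ++count;
    });
    assert(count == 3);
    assert(last == "my%2dfile.txt");                // still percent-encoded
}
```

The views remain valid only while the original buffer is alive and unmodified, which is the same lifetime rule that applies to a url_view.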
As with other library functions, decode_view permits accessing the individual elements of compound components, such as path segments and query parameters, while avoiding memory allocations entirely:
Code |
Output |
---|---|
segments_view segs = u.segments(); for( auto v : segs ) { std::cout << v << "\n"; } |
path to my-file.txt |
params_view params = u.params(); for( auto v : params ) { std::cout << "key = " << v.key << ", value = " << v.value << "\n"; } |
key = id, value = 42 key = name, value = John Doe Jingleheimer-Schmidt |
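The key/value reading of a query shown in the table above can be sketched in the same allocation-free style. The helper below is hypothetical and simplified: it splits on '&' and on the first '=', leaves keys and values percent-encoded, and ignores a trailing '&'. It is a standard-library illustration assuming C++17, not the library's params container.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Calls f(key, value) for each '&'-separated parameter in `query`.
// A parameter without '=' is reported with an empty value.
template <class F>
void for_each_param(std::string_view query, F f) {
    while (!query.empty()) {
        std::size_t amp = query.find('&');
        std::string_view pair = query.substr(0, amp);  // npos means "to the end"
        std::size_t eq = pair.find('=');
        if (eq == std::string_view::npos)
            f(pair, std::string_view{});
        else
            f(pair.substr(0, eq), pair.substr(eq + 1));
        if (amp == std::string_view::npos)
            break;
        query.remove_prefix(amp + 1);
    }
}

inline void params_demo() {
    std::size_t n = 0;
    for_each_param("id=42&name=John%20Doe",
                   [&](std::string_view key, std::string_view value) {
        if (n == 0)
            assert(key == "id" && value == "42");
        if (n == 1)
            assert(key == "name" && value == "John%20Doe");
        ++n;
    });
    assert(n == 2);
}
```

Since RFC 3986 does not define this interpretation, the splitting rules here are a convention chosen for the sketch.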
The library provides the containers url and static_url, which support modification of the URL contents. A url or static_url can be constructed from an existing url_view.
Unlike url_view, which does not take ownership of the underlying character buffer, the url container uses the default allocator to control a resizable character buffer which it owns.
url u = parse_uri( s ).value();
On the other hand, a static_url
has fixed-capacity storage
and does not require dynamic memory allocations.
static_url<1024> su = parse_uri( s ).value();
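To illustrate what fixed-capacity storage means in practice, here is a minimal sketch of a buffer whose characters live inside the object itself. fixed_buffer is a hypothetical class, not static_url, and throwing std::length_error on overflow is an assumption made for this sketch; this section does not say how the real container reports insufficient capacity. C++17 is assumed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string_view>

// Hypothetical fixed-capacity character buffer: the storage is a member
// array, so a successful assign performs no dynamic allocation.
template <std::size_t N>
class fixed_buffer {
    std::array<char, N> data_{};
    std::size_t size_ = 0;

public:
    // Copies s into the inline storage, or throws if it does not fit.
    // The size check happens first, so a failed assign changes nothing.
    void assign(std::string_view s) {
        if (s.size() > N)
            throw std::length_error("fixed_buffer: capacity exceeded");
        s.copy(data_.data(), s.size());
        size_ = s.size();
    }

    std::string_view view() const { return {data_.data(), size_}; }
    static constexpr std::size_t capacity() { return N; }
};

inline void fixed_buffer_demo() {
    fixed_buffer<16> b;
    b.assign("http://a.b/");             // 11 characters, fits
    assert(b.view() == "http://a.b/");

    bool threw = false;
    try {
        b.assign("http://example.com/too/long");
    } catch (std::length_error const&) {
        threw = true;
    }
    assert(threw);
    assert(b.view() == "http://a.b/");   // unchanged after the failed assign
}
```

The trade-off is that the capacity must be chosen at compile time, as with the 1024 in static_url<1024>.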
Objects of type url are std::regular. Similarly to built-in types such as int, a url is copyable, movable, assignable, default constructible, and equality comparable. They support all of the inspection functions of url_view, and also provide functions to modify all components of the URL.
Changing the scheme is easy:
u.set_scheme( "https" );
Or we can use a predefined constant:
u.set_scheme( scheme::https ); // equivalent to u.set_scheme( "https" );
The scheme must be valid, however, or an exception is thrown. All modifying functions perform validation on their input. Attempting to set part of the URL to an invalid string will result in an exception. It is not possible for a url to hold syntactically illegal text.
Modification functions return a reference to the object, so chaining is possible:
Code |
Output |
---|---|
u.set_host_address( parse_ipv4_address( "192.168.0.1" ).value() ) .set_port( 8080 ) .remove_userinfo(); std::cout << u << "\n"; |
https://192.168.0.1:8080/path/to/my%2dfile.txt?id=42&name=John%20Doe+Jingleheimer%2DSchmidt#page%20anchor |
All non-const operations offer the strong exception safety guarantee.
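The three properties described above (validated input, chainable setters, and the strong exception safety guarantee) fit together naturally when a setter validates first and modifies second. The sketch below shows that pattern with a hypothetical tiny_url class; it is not the library's url, the choice of std::invalid_argument is an assumption of the sketch, and only the scheme grammar is taken from RFC 3986. C++17 is assumed.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <string_view>

inline bool is_alpha(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
inline bool is_digit(char c) { return c >= '0' && c <= '9'; }

// RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
inline bool is_valid_scheme(std::string_view s) {
    if (s.empty() || !is_alpha(s[0]))
        return false;
    for (char c : s)
        if (!is_alpha(c) && !is_digit(c) && c != '+' && c != '-' && c != '.')
            return false;
    return true;
}

// Hypothetical miniature URL holder, used only to illustrate the pattern.
class tiny_url {
    std::string scheme_ = "http";
    unsigned port_ = 80;

public:
    // Validates first and modifies second, so a throw leaves *this unchanged.
    tiny_url& set_scheme(std::string_view s) {
        if (!is_valid_scheme(s))
            throw std::invalid_argument("invalid scheme");
        scheme_.assign(s.data(), s.size());
        return *this;                    // returning *this enables chaining
    }

    tiny_url& set_port(unsigned p) {
        if (p > 65535)
            throw std::invalid_argument("invalid port");
        port_ = p;
        return *this;
    }

    std::string const& scheme() const { return scheme_; }
    unsigned port() const { return port_; }
};

inline void tiny_url_demo() {
    tiny_url t;
    t.set_scheme("https").set_port(8080);
    assert(t.scheme() == "https" && t.port() == 8080u);

    bool threw = false;
    try {
        t.set_scheme("1http");           // a scheme must start with a letter
    } catch (std::invalid_argument const&) {
        threw = true;
    }
    assert(threw);
    assert(t.scheme() == "https");       // state is untouched after the failure
}
```

Because the object is never modified before validation succeeds, it cannot end up holding text that fails the grammar.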
The path segment and query parameter containers returned by a url
offer modifiable range functionality,
using member functions of the container:
Code |
Output |
---|---|
params_ref p = u.params(); p.replace(p.find("name"), {"name", "Vinnie Falco"}); std::cout << u << "\n"; |
https://192.168.0.1:8080/path/to/my%2dfile.txt?id=42&name=Vinnie%20Falco#page%20anchor |