The P5P Protocol

Abstract
This is the specification of the "Pretty Please Platform for Participating Publishers" (P5P). This document, along with its normative references, contains everything necessary for the implementation of interoperable P5P clients and servers.

Introduction
"Pretty Please Platform for Participating Publishers" (P5P) is a modest proposal that enables users to express their privacy and other preferences in a standard format that can be retrieved automatically and interpreted easily by web publishers.

P5P provides an opt-in mechanism for a user to convey their potentially intricate personal preferences to a publisher website. The protocol defines a large set of tokens that the user may express that they wish the web publisher to respect.

P5P-compatible sites will be notified of the user's personal preferences, and may automate their behavior based on these preferences as appropriate. Thus, each individual server need not query the user for preferences manually at each visit and may instead automatically determine the user's preferences. The user may change their mind at any time and adjust preferences in a single location.

Although P5P provides a technical mechanism for enabling websites to be aware of their visitors' preferences, there is of course no technical means by which a web client can enforce or verify that the server actually respects the preferences expressed by the client. Therefore, the use of the standard "pretty please" syntax helps ensure that the web publisher understands that the user's preferences are strongly held and that the publisher should feel very bad if those preferences are not respected.

Terminology
The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this document, are to be interpreted as described in RFC2119 [0].

Status of this Document
This document is expected to progress according to the process described in RFC1438 [1].

Determining the User's P5P Preferences
In the P5P System, the user's browser expresses their preferences using the P5P HTTP Request header.

Naturally, adding hundreds of P5P tokens to outbound requests would negatively impact browsers' investments in network performance [7]. Therefore, the P5P request header may contain either a short list of compact P5P tokens or a reference URL that points to a policy the user has created or has selected from a gallery of such policies authored by experts. It is expected that privacy wonks and other special interest communities will publish "reference" statements that users may select from.

Example #1:

P5P: NO-TRACK, PINKY-SWEAR

...specifies that the server should not track the user. The PINKY-SWEAR token is described in the Policy Tokens section below.

Example #2:

P5P: http://example.com/I-Love-Kittens.p5p

...specifies that the server should download and parse the specified P5P document from the provided URL. The P5P statement at this location contains preferences of interest to cat lovers; see Policy Tokens below.
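
As a rough sketch of the two header forms shown above, a server might distinguish a compact token list from a policy reference as follows. The function name classify_p5p_header is hypothetical, and two details are assumptions drawn from the examples rather than requirements of this specification: only http and https URLs are treated as policy references, and compact tokens are separated by commas.

```python
from urllib.parse import urlsplit


def classify_p5p_header(value):
    """Return ("url", url) for a policy reference, else ("tokens", [...])."""
    value = value.strip()
    parts = urlsplit(value)
    # Assumption: a policy reference is an absolute http(s) URL.
    if parts.scheme in ("http", "https") and parts.netloc:
        return ("url", value)
    # Assumption: compact tokens are comma-separated, as in Example #1.
    # Naive split: quoted values that themselves contain commas are not handled.
    tokens = [item.strip() for item in value.split(",") if item.strip()]
    return ("tokens", tokens)
```

Fetching and parsing the referenced policy document is left out, since this specification does not define the format of a .p5p file.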

If the web server receives multiple P5P statements, it SHOULD construct the user's preferences as the union of the statements. In the event of flat-out contradictions in the user's preferences, the server MAY decline to fulfill the request and instead return an HTTP/406 Not Acceptable response indicating that the server cannot satisfy the client's request conditions.
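
The union rule above might be sketched as follows. This is a minimal illustration, not a normative algorithm: the function name merge_statements is hypothetical, statements are assumed to be already parsed into dictionaries of token name to value, and a "flat-out contradiction" is assumed to mean the same token appearing with two different values.

```python
def merge_statements(statements):
    """Union several {token: value} dicts; return (http_status, preferences)."""
    merged = {}
    for statement in statements:
        for name, value in statement.items():
            key = name.upper()  # token names are case-insensitive
            if key in merged and merged[key] != value:
                # Assumed contradiction: the server MAY answer 406 Not Acceptable.
                return 406, {}
            merged[key] = value
    return 200, merged
```

A server that prefers to be forgiving could instead keep the first or last value seen; the specification leaves that choice open by using MAY.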

Policy Tokens
Tokens consist of a simple case-insensitive sequence of ASCII characters from the set ['-', 'A'-'Z', 'a'-'z']. Tokens that require a value will follow the token with a single equals sign and a quoted value string. If a token value is omitted, the implicit default value of "true" MUST be assumed.
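
A minimal sketch of this token grammar is shown below. The names TOKEN_RE and parse_token are hypothetical, and the sketch assumes that token names consist only of ASCII letters and hyphens and that values are enclosed in double quotes with no escaping.

```python
import re

# Assumption: names are ASCII letters and hyphens; values are double-quoted.
TOKEN_RE = re.compile(r'^([A-Za-z-]+)(?:="([^"]*)")?$')


def parse_token(text):
    """Parse one token into a (NAME, value) pair."""
    match = TOKEN_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid P5P token: {text!r}")
    name, value = match.group(1), match.group(2)
    # Names are case-insensitive, so normalize them to upper case.
    # A token without a value carries the implicit default "true".
    return name.upper(), value if value is not None else "true"
```

For instance, parse_token("no-track") yields the name NO-TRACK with the default value "true", while a token such as MOOD="happy" keeps its quoted value.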

Offering a rich set of policy tokens is critical to the success of the P5P specification. This version of the specification introduces a core set of preference tokens that the user may specify, but it is expected that future versions of this specification will include additional tokens as the need for such tokens is expressed by users. As you can imagine, there are many pieces of valuable metadata that the user may share. We expect several large social networking sites will be eager to participate in this system and will offer to auto-generate P5P declarations based on the private information they have aggregated over their users' thousands of actions.

The tokens defined in this version of the specification are as follows:

NO-TRACK
The client does not wish to be tracked. The server SHOULD interpret this token in the same manner as the Do-Not-Track header (DNT: 1).

Servers MUST immediately discontinue the use of RFC3751 OP [20] if this directive is specified by the client.
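
The equivalence between NO-TRACK and DNT: 1 could be checked along these lines. The function name client_declines_tracking is hypothetical, request headers are assumed to arrive as a plain dictionary, and the P5P header is assumed to carry a comma-separated compact token list rather than a policy URL.

```python
def client_declines_tracking(headers):
    """Return True if the request carries DNT: 1 or an affirmative NO-TRACK token."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    if normalized.get("dnt", "").strip() == "1":
        return True
    for item in normalized.get("p5p", "").split(","):
        name, _, value = item.strip().partition("=")
        # An omitted value defaults to "true" per the Policy Tokens rules.
        if name.upper() == "NO-TRACK" and (value.strip('"') or "true").lower() == "true":
            return True
    return False
```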

OSTRICH-MODE-PRIVACY
The client indicates that tracking is acceptable, but only if they can see the tracker. Servers MAY track the user using visible content and MUST NOT track the user using any invisible tracking pixels, tracking JavaScript, reverse-IP-tracking, or any backend process.

NO-EVIL
The client does not want the server to do anything evil in response to the HTTP request. The scope of "evil" may vary by user and hence clients SHOULD attempt to provide more specific directives as well. The server SHOULD attempt to do as little evil as is required to service the user's request.

Note: The server MUST NOT return any response to the client if the generation of that response involves any TCP/IP connection with the RFC3514 evil bit set [10].

NO-PHISH
The NO-PHISH directive indicates that the client would prefer not to see any socially engineered content soliciting the user's credentials or other personal information. This mechanism is considerably simpler and less expensive than building a real-time reputation service [8] and thus can be used by any browser.

Note: Do not confuse with NO-FISH.

NO-FISH
The NO-FISH directive indicates that the client would prefer not to view any more hardware-accelerated demos involving swimming betta fish, e.g. [9]. Web publishers may respond by instead showing another animal or by presenting the user with a survey inquiring as to how the user developed animosity toward aquatic life.

Note: Do not confuse with NO-PHISH.

NO-POPUPS
Indicates that the user would prefer not to receive any form of popup window, no matter what wonderful and unique opportunity is offered in such a window.

Note: Early field data suggests that this preference is universally held.

NO-ADS-IM-SURE-YOU-WILL-FIGURE-OUT-ANOTHER-BUSINESS-MODEL
Indicates that the user does not wish to be shown any form of advertising content, and expresses their earnest belief that the web publisher will find some way to remain in business without an income stream.

Note: This preference appears to be far more common than the related NO-ADS-HERE-IS-MY-BILLING-INFO-INSTEAD value.

NO-CELEBRITY-GOSSIP
Indicates that the user is not a fan of celebrity gossip and thus servers SHOULD refrain from serving content in this theme.

Note: Clients sending this token may find that many "news" sites appear to be blank.

SERVE-ME-ADS-BASED-ON-MY-SELF-IMAGE-RATHER-THAN-MY-BEHAVIOR
Indicates that the web publisher should select relevant advertising based on the user's self-expressed image (contained elsewhere in the P5P declaration) rather than based on analysis of the user's behavior on the web.

For instance, if the user combines this preference with the NO-CELEBRITY-GOSSIP preference, they should not be served advertisements associated with celebrities, even after issuing hourly search queries for "brangelina".

MOOD=value
Expresses the user's current mood, utilizing the values defined in RFC5841 [22].

The server SHOULD react in a sympathetic way and may infer other P5P directives in response to the specified mood.

As mood is often ephemeral and not stable over long periods of time, client software SHOULD make configuration of this flag simple and MAY infer it from observations of the user.

CAT-PERSON
Indicates that the user is a "Cat person" and site themes and stories should be adjusted accordingly.

DOG-PERSON
Indicates that the user is a "Dog person" and site themes and stories should be adjusted accordingly.

IS-A-DOG
Indicates that the user is actually a canine. This token satisfies a longstanding limitation of the Internet [11].

Note: An optional value token allows specification of the visitor's breed; if missing, the default value "mutt" is assumed.

USER-MIME-TYPE=value
Provides attributes describing the user's physical traits in the format specified by RFC1437 [17].

SORRY-I-AM-BROKE-RIGHT-NOW
The user expresses that they currently lack funds and appeals for donations or other financing cannot be accommodated at this time.

Internet Merchants MAY elect to offer special "zero-down financing" offers. Advertisers SHOULD avoid advertising luxury items to clients that send this preference as these may be offensive to the user.

SIGN=value
Indicates the user's astrological sign. This information may prove very useful in allowing sites to tailor their content to the user's temperament and outlook on life. Web publishers MAY correlate this information with a 3rd-party service to further tailor their content based on the visitor's latest horoscope information.

FAVORITE-COLOR=value
The client expresses their favorite color using an HTML color string; the website SHOULD react by coloring as many UI elements as possible using colors that match or complement the specified color.

Websites SHOULD support a comma-delimited sequence of color tokens for those users who just can't settle on one favorite.
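
As an illustration of handling a comma-delimited FAVORITE-COLOR value and choosing a complementary color, consider the sketch below. The function names are hypothetical, only six-digit hexadecimal color strings are handled (named HTML colors are not), and "complement" is assumed to mean inverting each RGB channel, which is one of several possible definitions.

```python
def split_favorite_colors(value):
    """Split a comma-delimited FAVORITE-COLOR value into individual colors."""
    return [color.strip() for color in value.split(",") if color.strip()]


def complement(hex_color):
    """Return the channel-inverted complement of a "#RRGGBB" color string."""
    digits = hex_color.strip().lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected a six-digit hex color: {hex_color!r}")
    # Inverting every channel is the same as subtracting from 0xFFFFFF.
    return f"#{0xFFFFFF - int(digits, 16):06x}"
```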

AGE=value
The user expresses their age as an integer, or a string containing a fraction.

Note: The fractional format is particularly popular for those under age 12. "Ten and three quarters" conveys important additional status amongst the user's peer group.

ACTUALAGE=value
The user expresses their *true* age accurately as an integer.

Note: Field research suggests that ACTUALAGE is lower than AGE until the ACTUALAGE value reaches 21. Often AGE=ACTUALAGE from 21-28, then AGE remains relatively constant despite increasing ACTUALAGE.

DONT-TELL-MOM
The user expresses that they would like the server to help them avoid any sequence of events that would result in their mother becoming aware of their web traffic.

Note: This token appears to be popular even when ACTUALAGE exceeds the age of majority for a given locale.

I-AM-SCHMIDT
Indicates that the user is willing for the web publisher to track and share their behavior with any other party. The user is thus indicating that they would never be doing anything that they wouldn't want everyone to know about [27].

DO-NOT-SUE
Indicates that the client would prefer that the server fail to return a response if the receipt of the response would put the client in legal jeopardy (e.g. due to intellectual property concerns).

DO-NOT-CENSOR
Indicates that the client should be exempted from all forms of censorship. Web publishers MAY ignore this preference if the client expresses an IS-A-MINOR value which indicates that the user has not reached the age of majority.

Note that intermediaries SHOULD leave this token intact in the client's request and SHOULD NOT otherwise interfere with the request or response. It is hoped that the Pretty Please token is sufficient to indicate to any intermediaries that the client bears no nefarious intent.

I-OWN-MY-WORDS
With this token, the user asserts ownership of the content they have submitted to the server, specifically prohibiting its reuse for any other purpose.

Servers may implement this preference using [24] as it is expected that this mechanism will provide an equivalent level of protection.

FOSS-ONLY
The user expresses that they are only willing to accept responses generated purely via Free and Open Source hardware and software stacks. Servers, routers and gateways SHOULD NOT satisfy any request if any proprietary hardware or software is needed to generate a response.

NO-SHINY
The user expresses that they are unwilling to accept responses that are built upon Rich Internet Application frameworks like Adobe Flash.

Note: This preference is sometimes indicated by users but is more commonly imposed due to hardware lock-in limitations enforced by certain vendors.

OPEN-CONTENT-ONLY
The user expresses that they are only willing to view content which is freely available via a suitable Creative Commons license. If no such content is available, the server SHOULD NOT return anything. The server MAY instead return content unrelated to the user's request that satisfies the user's license demands.

GEEZ-I-AM-AT-WORK
The user expresses that they are at work and only wish to view content which is appropriate for that environment.

All NSFW content should be suppressed by the web publisher.

Note: This may result in many web pages appearing completely blank.

YOUTUBE-YOU-MAY-THINK-YOU-ARE-FUNNY-BUT-YOU-ARE-NOT
Suggests that a popular video sharing site stop generating a phony context menu that overrides the default Save Video option in their HTML5 video player [28] such that the user is "rick-rolled" into watching a Rick Astley music video [29].

INTERNETISSERIOUSBUSINESS
Indicates that the user believes that the Internet should be used for serious business and tomfoolery should be limited. In particular, users sending this token should not be served novelty RFCs published on the first day of the fourth month or similarly silly content.

ICANHAZCHEEZBURGER
Non-intuitively, indicates that the user wishes to be shown pictures of cats arranged in unlikely circumstances. Ideally, images should contain mildly amusing captions and be suitable for sharing with others.

ICANHAZCAT
Non-intuitively, indicates that the user wishes to be shown pictures of ground beef products arranged in unlikely circumstances. Ideally, images should contain mildly amusing captions and be suitable for sharing with others.

NO-MEMES
Indicates that the user is not a follower of Internet-spread memes and thus would prefer not to read content with oblique references to such memes. Servers SHOULD avoid referencing memes as they won't make any sense to the user and MAY lead to confusion or other unhappy feelings.

NO-MIMES
Indicates that the user would prefer to avoid images or video content of mimes, feeling that such performers are "just creepy." Servers SHOULD avoid displaying any mimes to avoid giving the user nightmares.

Note: This token has nothing to do with Multipurpose Internet Mail Extensions [31].

IS-A-MINOR=value
A "true" value indicates that the user has not attained the age of majority within their locale and thus is not yet an adult. A "false" value indicates that the user has reached the age of majority and thus may be shown adult content.

Note: The GEEZ-I-AM-AT-WORK token, if present, should be respected even if IS-A-MINOR="false".

Note: Do not confuse with the unrelated IS-A-MINER token.
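
The interaction between IS-A-MINOR and GEEZ-I-AM-AT-WORK described above might be expressed as the following sketch. The function name may_show_adult_content is hypothetical, preferences are assumed to be a dictionary of upper-case token names to values, and treating a missing IS-A-MINOR token as "do not show adult content" is a conservative assumption that this specification does not state.

```python
def may_show_adult_content(preferences):
    """Decide whether adult content may be served for these preferences."""
    # GEEZ-I-AM-AT-WORK is respected even if IS-A-MINOR="false".
    if "GEEZ-I-AM-AT-WORK" in preferences:
        return False
    # Assumption: without an explicit "false", err on the side of caution.
    return preferences.get("IS-A-MINOR") == "false"
```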

IS-A-MINER
Indicates that the user is charged with extracting elements of value from subterranean locales, typically using a shovel or pick.

Websites MAY wish to react to this token by utilizing a darker color palette to avoid causing undue stress for the often light-sensitive eyes of visitors in this line of work.

Note: Do not confuse with the unrelated IS-A-MINOR token. Most children prefer bright shiny things.

NOT-PUNNY
Indicates that the user has recognized puns are the lowest form of wit, and are typically only deemed funny by their authors.

Websites SHOULD limit their use of puns and other wordplay when this token is expressed by the user.

SSN=value
A United States-based user may use this token to provide their Social Security Number.

When sending this token, clients SHOULD also send the NO-EVIL token to ensure that nothing bad happens.

DONT-TASE-ME-BRO
The user expresses the desire to avoid any form of physical violence, and specifically suggests that electric darts would not be welcomed [26].

COFFEE=value
Describes the user's coffee preferences. Valid values include "hot", "decaf", "black", "milk", and "iced".

Note: Servers SHOULD support "decaf" even if the underlying implementation is dependent upon RFC2324 [18].

Note: This syntax is insufficiently expressive for patrons of Starbucks. For expressing the 87,000 possible configurations of Starbucks coffee, the client MAY provide a base64 encoded byte array, compressed with DEFLATE [30].
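
A minimal sketch of the base64-over-DEFLATE encoding mentioned in the note above follows. The function names are hypothetical, and because this specification does not define the layout of the byte array, the sketch simply assumes the order is a UTF-8 text description.

```python
import base64
import zlib


def encode_coffee_order(order):
    """Compress an order with raw DEFLATE (RFC 1951), then base64 encode it."""
    # A negative wbits value requests a raw DEFLATE stream without zlib framing.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    raw = compressor.compress(order.encode("utf-8")) + compressor.flush()
    return base64.b64encode(raw).decode("ascii")


def decode_coffee_order(value):
    """Reverse encode_coffee_order: base64 decode, then inflate."""
    raw = base64.b64decode(value)
    return zlib.decompress(raw, -15).decode("utf-8")
```

Decoding the result of encode_coffee_order returns the original order text, so a server can recover the full configuration from the COFFEE value.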

I-AM-COMFORTABLE-WITH-MY-WEIGHT-THANK-YOU
The user requests that the web publisher avoid showing weight-loss advertisements, especially those which attempt to incite self-consciousness on the part of the user.

DO-NOT-CRASH
The user requests that the website avoid crashing for the duration of the request. Web publishers should see RFC748 [12] for help in converting this directive into protocol level preferences that will help satisfy the request. The routing mechanisms defined in RFC1217 [23] are generally not suitable for satisfying requests bearing this preference.

I-WEAR-A-TINFOIL-HAT
The user expresses that they are especially concerned about information leakage and external interference with their personal thought processes.

Web publishers should see RFC1097 [13] for help in converting this directive into protocol level preferences that will help avoid inadvertent subliminal messaging.

I-AM-JUST-IMPATIENT
This token indicates that the user is inclined toward impatience and may frequently refresh the page due to the eagerness to receive new content.

Servers SHOULD NOT interpret unnaturally frequent requests bearing this token as a Denial-of-Service attack.

Note: In the wild, this token is rarely correlated with the COFFEE="decaf" token. If both tokens are present, servers MAY conclude that the request is malicious.
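
The guidance above could be approximated by a small classifier such as the one below. The function name and the three labels it returns are hypothetical, and preferences are assumed to be a dictionary of upper-case token names to values.

```python
def classify_frequent_requester(preferences):
    """Label a client that is refreshing unusually often."""
    impatient = "I-AM-JUST-IMPATIENT" in preferences
    decaf = preferences.get("COFFEE") == "decaf"
    if impatient and decaf:
        # Rarely seen together in the wild: the server MAY treat this as malicious.
        return "suspicious"
    if impatient:
        # SHOULD NOT be interpreted as a Denial-of-Service attack.
        return "impatient"
    return "unknown"
```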

GO-FASTER-I-AM-IN-A-HURRY
The user expresses that they are under a time constraint and the operation should be completed as quickly as possible.

Web publishers should avoid any process which relies upon RFC1149 [14] or RFC2549 [15] in satisfying this request as these implementations are unlikely to meet the performance requirements implied by this directive. ULS technology [16] should be avoided in servicing requests with this token, particularly if the website is communicating any non-positive news. SFSS [21] may only be used if the endpoints are particularly talented and can maintain sustained throughput of 300spm; compression is highly recommended.

NO-RABBLE
The user expresses that their time is valuable and they have no desire to read remarks from anonymous internet users unless those remarks have been deemed exceptionally insightful.

Note: This preference is generally incompatible with content generated under RFC2795 [19] and similar processes in use on most Internet sites.

PINKY-SWEAR
If this token is present, the client asserts that it is providing only truthful information in the provided P5P statement. For instance, an expressed AGE value MUST equal ACTUALAGE.

Further, the client demands that the server also be an honest dealer, and the server MUST accommodate all of the client's preferences or it MUST return an HTTP/406 response.
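
The all-or-nothing obligation that PINKY-SWEAR places on the server might be sketched as follows. The function name pinky_swear_status is hypothetical, as is the idea that a server keeps a set of token names it is able to honor; preferences are assumed to be a dictionary of upper-case token names to values.

```python
def pinky_swear_status(preferences, honorable):
    """Return the HTTP status a server should use given what it can honor."""
    if "PINKY-SWEAR" not in preferences:
        return 200  # no promise demanded, so best effort suffices
    # Every other expressed preference must be one the server can accommodate.
    unmet = set(preferences) - set(honorable) - {"PINKY-SWEAR"}
    return 406 if unmet else 200
```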

Conclusion
Contrary to common belief, there is no inherent conflict between the goals and desires of web users and web publishers, content providers, tracking firms, and advertisers.

Instead, the only missing piece has been a mechanism whereby the user can clearly and unambiguously state their preferences. P5P provides that mechanism. Websites can now easily learn their users' preferences and if their default practices and goals are in conflict with a given visitor's preferences, they may easily adjust their practice to accommodate their valued guests.

Appendix 1: Relationship to other specifications

P3P
The Platform for Privacy Preferences (P3P) specification [2] provides a mechanism by which a given web publisher may advertise a legally-binding statement of their privacy practices.

Browsers may interpret the P3P tokens and restrict [3] the functionality of mechanisms useful for tracking the user if the web publisher advertises practices that are offensive to the user's sensibilities.

Unfortunately, in the face of such automated enforcement, some popular websites [4] express P3P declarations using technically legal but semantically meaningless policy statements. For example:

P3P: CP="If we told you what we do with these cookies, your browser would block them automatically, so we will not."

As the P3P specification suggests that browsers should ignore unknown tokens, the previous statement is technically a valid statement with undefined meaning.

The same is true for the following P3P declaration based upon an undocumented Jedi mind trick:

P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."

It is inelegantly left to the courts to decide whether such statements constitute falsehood by omission, in the same vein as a child who, when asked to enumerate what he ate at his friend's house, mentions the carrots but fails to mention the quart of ice cream.

P5P resolves the untrustworthy publisher problem by saying "Pretty Please." Even small children know that a request expressed with the "Pretty Please" token is much more likely to be accommodated than one that lacks this important token.

Do-Not-Track
The "Do-Not-Track" specification [25] enables a web user to communicate to the web server that tracking is not desired. The exact mechanism used for this communication is an HTTP request header of the following form:

DNT: 1

Unlike P5P, the Do-Not-Track header does not include any "Pretty Please" tokens and thus is somewhat rude.

Advertisers have expressed the concern that the Do-Not-Track header is "like sending a smoke signal in the middle of Manhattan; it might draw a lot of attention, but no one knows how to read the message" [5]. P5P helps resolve this ambiguity by providing a rich set of policy tokens that enable the web site to understand the specific details of the user's preferences.

Web Tracking Protection

The Web Tracking Protection specification [6] describes a feature that allows a browser user to select one or more Tracking Protection Lists (TPLs) or generate their own. Each TPL expresses a set of policies as to which web addresses the browser may contact in a 3rd party context. A TPL effectively acts as a "Do-not-call" list for browser users who do not wish to send information to web publishers who may potentially track them.

One popular browser implementation uses the presence of one or more configured TPLs as a signal that it should also send the DNT: 1 header to indicate the implied user preference that they do not wish to be tracked.

Because Web Tracking Protection enables privacy using technical means rather than relying upon the goodwill of web tracking companies, it does not give the web tracking companies the opportunity to interpret the user's potentially rich set of personal preferences and react accordingly.

As Web Tracking Protection is technically enforced by client browser software, the web publisher and web tracking companies may find it difficult to successfully offer a TPL-protected web user "valuable offers" and "unique opportunities".

In addition, because TPLs can prevent a tracking or analytics company from accumulating, analyzing, and developing a per-user profile that can be sold to web publishers around the web, users may see less relevant offers and opportunities. P5P mitigates this problem by enabling a P5P user to express a rich set of information about their personal interests and preferences.

Appendix 2: International Implications
The World Wide Web is an international network that spans the world and thus can be subject to different legal and regulatory requirements in every locale. Privacy mechanisms which depend upon a specific regulatory framework can often be circumvented by placing a node within a more lax regulatory locale.

In contrast, the P5P mechanism does not depend on legal or regulatory frameworks and instead operates based on the universally-recognized value that humans bestow upon politeness.

Appendix 3: Related References

[0] http://tools.ietf.org/html/rfc2119
[1] http://tools.ietf.org/html/rfc1438
[2] http://www.w3.org/TR/P3P/
[3] http://msdn.microsoft.com/en-us/library/ms537343(v=vs.85).aspx
[4] http://www.facebook.com/help/?topic=p3p
[5] http://online.wsj.com/article/SB10001424052748704692904576166820102959428.html
[6] http://www.w3.org/Submission/web-tracking-protection/
[7] http://blogs.msdn.com/b/ie/archive/2010/07/14/caching-improvements-in-internet-explorer-9.aspx
[8] http://blogs.msdn.com/b/ie/archive/2010/12/14/enhanced-protection-with-ie9-s-smartscreen-filter.aspx
[9] http://ie.microsoft.com/testdrive/Performance/FishIETank/Default.html
[10] http://tools.ietf.org/html/rfc3514
[11] http://www.unc.edu/depts/jomc/academics/dri/idog.html
[12] http://tools.ietf.org/html/rfc748
[13] http://tools.ietf.org/html/rfc1097
[14] http://tools.ietf.org/html/rfc1149
[15] http://tools.ietf.org/html/rfc2549
[16] http://tools.ietf.org/html/rfc1216
[17] http://tools.ietf.org/html/rfc1437
[18] http://tools.ietf.org/html/rfc2324
[19] http://tools.ietf.org/html/rfc2795
[20] http://tools.ietf.org/html/rfc3751
[21] http://tools.ietf.org/html/rfc4824
[22] http://tools.ietf.org/html/rfc5841
[23] http://tools.ietf.org/html/rfc1217
[24] http://en.wikipedia.org/wiki/Bozo_bit
[25] http://tools.ietf.org/html/draft-mayer-do-not-track-00
[26] http://www.wired.com/threatlevel/2007/09/dont-tase-me-br/
[27] http://gawker.com/5419271/google-ceo-secrets-are-for-filthy-people
[28] http://www.youtube.com/html5
[29] http://www.youtube.com/watch?v=dQw4w9WgXcQ
[30] http://tools.ietf.org/html/rfc1951
[31] http://tools.ietf.org/html/rfc2046