
Shallow understanding of HTTP

2022-04-29 09:19:14 from_ the_ star

This article is mainly based on HTTP/1.1.

A brief introduction to HTTP

To understand HTTP (HyperText Transfer Protocol), we need to start with TCP/IP: HTTP is one of the protocols in the TCP/IP suite.

TCP/IP (Transmission Control Protocol/Internet Protocol) refers to the protocol suite that allows information to be transmitted across multiple different networks. TCP/IP does not mean just the two protocols TCP and IP. It refers to a suite made up of FTP, SMTP, TCP, UDP, IP, and other protocols, and it is called TCP/IP simply because TCP and IP are its most representative members.

Three protocols in the TCP/IP suite are inseparable from HTTP: IP, TCP, and DNS.

  • IP is the protocol for transmitting information between networks. Through IP routing, it delivers IP packets from the source device (for example, the user's computer) to the destination device (which has a unique IP address).
  • DNS looks up the IP address that belongs to a domain name.
  • TCP (Transmission Control Protocol) is a connection-oriented, reliable, byte-stream-based communication protocol. Put simply, you can think of TCP as UDP with an acknowledgment mechanism added: every packet sent must be acknowledged, and if a packet is lost and no acknowledgment arrives, the sender has to resend it.

Now let's look at the roles that IP, TCP, and DNS play when communicating over HTTP:
[Figure: the roles of IP, TCP, and DNS in the HTTP communication process]
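
The division of labor between the three protocols can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not something from the original figure: the function names `build_request` and `fetch` are hypothetical, and the host you pass in is whatever server you choose to try it against.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """HTTP layer: compose a minimal HTTP/1.1 GET request as text."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close so we can read until EOF
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, port: int = 80, path: str = "/") -> bytes:
    """Resolve the host, open a TCP connection, and send one HTTP request."""
    # DNS: look up an IPv4 address for the host name.
    family, socktype, proto, _, address = socket.getaddrinfo(
        host, port, socket.AF_INET, socket.SOCK_STREAM
    )[0]
    # TCP: open a reliable byte stream; IP routes the packets underneath.
    with socket.socket(family, socktype, proto) as conn:
        conn.connect(address)
        conn.sendall(build_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

On a machine with network access, calling `fetch` with a reachable host name should return the raw response bytes, beginning with the status line.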

HTTP message

HTTP specifies that a request is sent from the client, and the server finally responds to that request and returns a response. So in the figure above, which shows the request process, the HTTP message is a request message. The response process is similar, and what the server sends is called a response message. HTTP exchanges information through HTTP messages.

An HTTP message is a text string made up of multiple lines of data. It is roughly divided into the message header, a blank line (which acts as a separator), and the message body (optional).

The message header consists of:

  1. Request line (request messages): the request method, the request URL, and the HTTP version.
    GET / HTTP/1.1
  2. Status line (response messages): the HTTP version, the status code of the result, and the reason phrase.
    HTTP/1.1 200 OK
  3. Header fields: general headers, request headers, response headers, and entity headers.
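
The layout described above (start line, header fields, blank line, optional body) can be illustrated with a small parser. This is a minimal sketch for well-formed messages only; `parse_message` is a hypothetical helper name, and the sample message is made up for illustration.

```python
def parse_message(raw: str):
    """Split an HTTP message into (start line, header dict, body)."""
    # The first blank line separates the message header from the body.
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        # Each header field has the form "Name: value".
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


request = "GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
response = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
```

Here `parse_message(request)` yields a request line and an empty body, while `parse_message(response)` yields a status line and the body `hello`.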

Request method

Side effects: if sending the request does not change the resource, the method is said to have no side effects.
Idempotent: if the resource ends up in the same state whether the request is submitted M times or N times, the method is said to be idempotent.
1. GET: no side effects, idempotent
2. POST: has side effects, not idempotent
3. PUT: has side effects, idempotent
4. DELETE: has side effects, idempotent
5. HEAD: no side effects, idempotent
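
To make the two definitions concrete, here is a toy in-memory model of a server's resources. It is only a sketch: `ResourceStore` and its methods are hypothetical names that mimic the semantics of each HTTP method, not a real server.

```python
class ResourceStore:
    """A toy model of server-side resources keyed by identifier."""

    def __init__(self):
        self.items = {}
        self.next_id = 1

    def get(self, key):
        # GET only reads: no side effects.
        return self.items.get(key)

    def put(self, key, value):
        # PUT replaces the resource at a known key: repeating it changes nothing further.
        self.items[key] = value

    def post(self, value):
        # POST creates a new resource each time: repeating it keeps adding entries.
        key = self.next_id
        self.next_id += 1
        self.items[key] = value
        return key

    def delete(self, key):
        # DELETE removes the resource: deleting again leaves the same state.
        self.items.pop(key, None)
```

Calling `put` twice with the same arguments leaves the store in the same state as calling it once, whereas calling `post` twice creates two separate resources.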

Status code

The status code appears in the response message and describes the result of the request. From the status code you can tell whether the server handled the request normally or an error occurred.

1XX: Informational status code

2XX: Success status code

200: The request succeeded
204: The request was processed successfully, but there is no resource to return
206: The client sent a range request (using the Range header) and the server returned only that part of the resource

3XX: Redirection status codes

301: Permanent redirect. The requested resource has been assigned a new URI, and future requests should use the URI the resource now refers to.
302: Temporary redirect. The requested resource has been assigned a new URI, and the user is expected to use the new URI for this request only.
304: The resource has not been modified. The browser has a cached copy and should use it.

4XX: Client error status code

400: The client's request has a syntax error and the server cannot understand it
401: The request is not authorized (authentication is required)
403: The server refuses access; the response may explain why
404: The requested resource does not exist

5XX: Server error status code

500: An unexpected error occurred on the server and it could not process the request
503: The server is temporarily overloaded or down and may return to normal later
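
Because the first digit determines the class of a status code, a client can branch on it without knowing every individual code. The sketch below uses Python's standard `http.HTTPStatus` enum for the reason phrases; the helper name `describe` is hypothetical.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code, its standard reason phrase, and its class."""
    category = STATUS_CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Not a registered status code in Python's table.
        phrase = "Unknown"
    return f"{code} {phrase} ({category})"
```

For example, `describe(404)` returns `404 Not Found (client error)`.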

Header fields

General header fields

Headers used by both request messages and response messages.
Connection: controls header fields that should no longer be forwarded to proxies, and manages persistent connections (persistent connections are enabled by default).

Request header fields

Headers used in request messages. They add supplementary information about the request, information about the client, and priorities for the response content.
Host: the only header field that a request must include. By the time a request is sent to the server, the host name has already been resolved to an IP address, and the connection is made to that address. But if multiple domain names are deployed and running on the same IP address, the server cannot tell which domain the request is meant for. The Host header field is therefore needed to specify the host name being requested. If the server has no host name, an empty value is sent.
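
The role of Host can be shown with a toy dispatcher for several sites sharing one IP address. This is a sketch under stated assumptions: the `SITES` table, the domain names, and the `route` function are all hypothetical. Returning 400 for a missing Host header reflects the HTTP/1.1 requirement that a request must carry one.

```python
# Hypothetical sites that all live on the same IP address.
SITES = {
    "blog.example.com": "Blog home page",
    "shop.example.com": "Shop home page",
}


def route(headers: dict) -> tuple:
    """Pick the site to serve based on the Host request header."""
    host = headers.get("Host")
    if host is None:
        # An HTTP/1.1 request without Host is malformed.
        return 400, "Bad Request: missing Host header"
    # Drop an optional ":port" suffix and ignore case.
    name = host.split(":")[0].lower()
    if name in SITES:
        return 200, SITES[name]
    return 404, "No such site on this server"
```

Two requests arriving at the same address are told apart only by this header: `route({"Host": "blog.example.com"})` and `route({"Host": "shop.example.com"})` return different pages.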

Response header fields

Headers used in response messages. They supplement the response with additional information and may also require the client to attach additional information.

Entity header fields

Headers used for the entity part of request messages and response messages. They add entity-related information, such as when the resource content was last updated.

Header fields for cookies

In the response message, the server uses the Set-Cookie field to tell the client which cookie information to save. When the client sends a request to that server again, it automatically adds a Cookie field carrying the cookie information it saved last time. Cookies have recently gained a new SameSite attribute:

1. Strict: the strictest setting; third-party cookies are prohibited entirely
2. Lax: the default (in browsers that have adopted the change); links, prerender requests, and GET forms can still send third-party cookies
3. None: third-party cookies are allowed, provided the Secure attribute is also set (the cookie is only sent over HTTPS); otherwise the setting has no effect.
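
As an illustration of the Set-Cookie and Cookie exchange, the sketch below uses Python's standard `http.cookies` module (its `samesite` attribute requires Python 3.8 or later). The helper names `make_set_cookie` and `parse_cookie_header` and the cookie name `sid` are hypothetical.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name: str, value: str, same_site: str = "Lax") -> str:
    """Server side: build the value of a Set-Cookie header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["samesite"] = same_site
    if same_site == "None":
        # SameSite=None only takes effect together with Secure.
        cookie[name]["secure"] = True
    return cookie[name].OutputString()


def parse_cookie_header(header_value: str) -> dict:
    """Server side: read the Cookie header sent back by the client."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}
```

A server would send `make_set_cookie("sid", "abc123")` in its response, and on the next request recover the value with `parse_cookie_header("sid=abc123")`.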

Client cache

Let's talk about caching based on these header fields. The client caches the result of a request according to the cache information in the response message. Caching is divided into the mandatory cache and the negotiated cache.

The client first uses the mandatory cache rules to judge whether the cached resource has expired. If it has not expired, the cached copy is used directly; if it has expired, the mandatory cache is no longer valid.
The client then falls back to the negotiated cache: it takes the Etag and Last-Modified values from the earlier response and sends them to the server under the field names If-None-Match and If-Modified-Since.
The server compares them with the current state of its own resource. If the resource has not been updated it returns 304; if it has been updated it returns the new resource.

  • Mandatory cache
    Priority: Cache-Control > Expires
    Expires: entity header field. The server returns the time at which the cached result expires.
    Cache-Control: general header field. Controls the behavior of the cache.

    • public: both the client and proxy servers may cache the response
    • private: only the client may cache the response
    • no-cache: the response may be cached, but it must be revalidated with the server (negotiated cache) before it is used
    • no-store: do not cache at all
    • max-age=xxx: a relative lifetime in seconds
  • Negotiated cache
    Priority: Etag/If-None-Match > Last-Modified/If-Modified-Since
    Etag: response header field. On some servers, the default ETag value is obtained by hashing the file's inode (INode), size (Size), and last-modified time (MTime).
    If-None-Match: request header field. If the browser finds an Etag in a response header, it sends If-None-Match (whose value is the Etag value) in the request header the next time it requests the resource. The server compares the value with the current resource and decides whether to return 200 or 304.
    Last-Modified: entity header field. The server tells the browser when the resource was last modified.
    If-Modified-Since: request header field. If the browser finds Last-Modified in a response header, it sends If-Modified-Since in the next request to the server, carrying that Last-Modified value.
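
The decision flow described above can be sketched as two small functions, one for each kind of cache. This is a simplified illustration, not a complete cache implementation: `is_fresh`, `revalidate`, and `LAST_MODIFIED` are hypothetical names, ETags are compared as plain strings, and the timestamp is an arbitrary example value.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Example resource state (hypothetical value, timezone-aware UTC).
LAST_MODIFIED = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)


def is_fresh(age_seconds: float, max_age: int) -> bool:
    """Mandatory cache: the stored copy is usable while younger than max-age."""
    return age_seconds < max_age


def revalidate(request_headers: dict, etag: str, last_modified: datetime) -> int:
    """Negotiated cache, server side: 304 if the client's copy is current, else 200."""
    if "If-None-Match" in request_headers:
        # The ETag comparison takes priority over the timestamp check.
        return 304 if request_headers["If-None-Match"] == etag else 200
    if "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        return 304 if last_modified <= since else 200
    # No validators sent: return the full resource.
    return 200
```

A client would call `is_fresh` first and only contact the server, which then runs something like `revalidate`, once the cached copy has expired.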

Upgrading HTTP


HTTP has no encryption and sends data in plaintext, so its content may be eavesdropped on.
It does not verify the identity of the communicating party, so you may encounter impersonation.
It cannot prove the integrity of a message, so the message may be tampered with in transit.

To address these shortcomings, HTTPS came into being. HTTPS is not a new application-layer protocol. It is a transport protocol for secure communication over computer networks that still communicates using HTTP but uses SSL/TLS to encrypt the packets. The identity problem is solved with certificates plus digital signatures:
1. The server applies to a CA (a trusted third-party organization) for a certificate and submits its own public key. The CA chooses an encryption algorithm and a hash algorithm, digitally signs the server's public key with the CA's private key, and issues the public-key certificate.
2. When the server establishes a connection with the client, it sends the certificate.
3. The client confirms that the CA is a legitimate authority (through the system's certificate manager), uses the CA's public key to verify the digital signature on the certificate, and checks other information, such as that the certificate has not expired and that the domain being accessed matches the domain in the certificate. If all checks pass, the SSL certificate is validated.
4. Using the encryption algorithm and hash algorithm specified by the CA, the browser first generates a string of random numbers as the secret, encrypts it with the server's public key from the certificate, and sends it to the server.
5. The server decrypts it with its private key. From then on, communication between the two sides is encrypted with this secret, which means symmetric encryption is now in use.
During this process, data is sent together with a message digest called a MAC (Message Authentication Code). The MAC makes it possible to detect whether a message has been tampered with, which protects the integrity of the message.
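
The idea behind the MAC can be illustrated with Python's standard `hmac` module. This is only a sketch of the general technique (HMAC with SHA-256): the actual MAC construction in SSL/TLS depends on the negotiated cipher suite, and the names `sign` and `verify` and the sample key are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute a MAC over the message with the shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the MAC and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


# Both sides hold the same secret (an illustrative value here).
shared_key = b"shared-secret"
tag = sign(shared_key, b"hello")
```

With this setup, `verify(shared_key, b"hello", tag)` succeeds, while any change to the message (or a wrong key) makes verification fail.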


Problems with HTTP/1.1:

  1. Head-of-line blocking: a single TCP connection can handle only one request at a time;
  2. To speed things up, it relies too heavily on multiple concurrent TCP connections, but establishing a TCP connection is expensive;
  3. Header redundancy: the same headers are sent back and forth every time, which is wasteful, and the format is plain text;
  4. The server cannot actively send information to the client.

What HTTP/2 improves

  1. New binary format: more robust
  2. Header compression
  3. Server push
  4. Multiplexing: that is, connection sharing. Every request corresponds to an id, so a single connection can carry multiple requests. The requests on a connection can be interleaved arbitrarily, and the receiver uses each request's id to assign the pieces back to the right request.
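
Multiplexing can be pictured as cutting each request into pieces tagged with its id, mixing the pieces on one connection, and regrouping them by id on the other side. The sketch below is a deliberately simplified model: real HTTP/2 frames are binary and carry more fields, and the names `to_frames`, `interleave`, and `reassemble` are hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest


def to_frames(stream_id: int, payload: bytes, size: int = 4):
    """Cut one request into (id, chunk) pieces."""
    return [(stream_id, payload[i:i + size]) for i in range(0, len(payload), size)]


def interleave(*streams):
    """Mix the pieces of several requests onto one shared connection."""
    mixed = []
    for group in zip_longest(*streams):
        mixed.extend(frame for frame in group if frame is not None)
    return mixed


def reassemble(frames):
    """Receiver side: regroup the pieces by id to rebuild each request."""
    messages = defaultdict(bytes)
    for stream_id, chunk in frames:
        messages[stream_id] += chunk
    return dict(messages)


wire = interleave(
    to_frames(1, b"GET /index.html"),
    to_frames(3, b"GET /style.css"),
)
```

Even though the pieces of the two requests alternate in `wire`, `reassemble(wire)` recovers both requests intact, keyed by their ids.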

Reference books

  1. HTTP: The Definitive Guide

