drewdevault.com

[mirror] blog and personal website of Drew DeVault git clone https://hacktivis.me/git/mirror/drewdevault.com.git

A-broad-intro-to-networking.md (13070B)


  1. ---
  2. date: 2016-12-06
  3. # vim: tw=80 :
  4. layout: post
  5. title: A broad intro to networking
  6. tags: [networking, instructional]
  7. ---
  8. Disclaimer: I am not a network engineer. That's the point of this blog post,
  9. though - I want to share with non-networking people enough information about
  10. networking to get by. Hopefully by the end of this post you'll know enough about
  11. networking to keep up with a conversation on networking, or know what to search
  12. for when something breaks, or know what tech to research more in-depth when you
  13. are putting together something new.
  14. ## Layers
  15. The **OSI model** is the standard model we describe networks with. There are 7
  16. **layers**:
  17. Layer 1, the physical layer, is the electrical engineering stuff.
  18. Layer 2, the link layer, is how devices talk to each other.
  19. Layer 3, the network layer, is what they talk about.
  20. Layer 4, the transport layer, is where things like TCP and UDP live.
  21. Layers 5 and 6 aren't very important.
  22. Layer 7, the application layer, is where Minecraft lives.
  23. When you hear some security guy talking about a "layer 7 attack", he's
  24. talking about an attack that focuses on flaws in the application layer. In
  25. practice that means e.g. flooding the server with HTTP requests.
  26. ## 1: Physical Layer
  27. *Generally implemented by matter*
  28. Layer 1 is the hardware of a network. Commonly you'll find things here like your
  29. computer's **NIC** (network interface controller), aka the network interface or
  30. just the interface, which is the bit of silicon in your PC that you plug network
  31. cables or WiFi signals into.
  32. On Linux, network interfaces are assigned names like *eth0* or *eno1*. eth0 is
  33. the traditional name for the 0th wired network interface. eno1 is the newer
  34. "consistent network device naming" format popularized by tools like udev (which
  35. manages hardware on many Linux systems) - this is a deterministic name based on
  36. your network hardware, and won't change if you add more interfaces. You can
  37. manage your interfaces with the *ip* command (`man 8 ip`), or the now-deprecated
  38. *ifconfig* command. Some non-Linux Unix systems have not deprecated ifconfig.
  39. This layer also has ownership over **MAC addresses**, in theory. A MAC address
  40. is an allegedly unique identifier for a network device. In practice, software
  41. at higher layers can use whatever MAC address it wants. You can change your MAC
  42. address with the ip command, which is often useful for dealing with annoying
  43. public WiFi resource limits or for frustrating someone else on the network.
  44. Other things you find at layer 1 include **switches**, which do network
  45. multiplexing (they generally can be thought of as networking's version of a
  46. power strip - they turn one Ethernet port into many). Also common are
  47. **routers**, whose behaviors are better explained in other layers. You also have
  48. hardware like **firewalls**, which filter network traffic, and **load
  49. balancers**, which distribute a load among several nodes. Both firewalls and
  50. load balancers can be done in software, depending on your needs.
  51. ## 2: Data link layer
  52. *Generally implemented by network hardware*
  53. At this layer you have protocols that cover how nodes talk to one another. Here
  54. the **ethernet** protocol is almost certainly the most common - the protocol
  55. that goes over your network cables. Said network cables are probably **Cat 5**
  56. cables, or "category 5" cables.
  57. Other protocols here include tunnels, which allow you to indirectly access a
  58. network. A common example is a **VPN**, or virtual private network, which allows
  59. you to participate in another network remotely. Tunnels can also be useful for
  60. getting around firewalls, or for setting up a secure means to access resources
  61. on another network.
  62. ## 3: Network layer
  63. *Generally implemented by the kernel*
  64. As a software guy, this is where the fun really starts. The other layers are how
  65. computers talk to each other - this layer is what they talk about. Computers are
  66. often connected via a **LAN**, or local area network - a *local* network of
  67. computers. Computers are also often connected to a **WAN**, or wide area
  68. network - the internet is one such network.
  69. The most common protocol at this layer is IP, or Internet Protocol. There are
  70. two versions that matter: IPv4, and IPv6. Both of them use **IP addresses** to
  71. identify nodes on their networks, and they carry **packets** between them. The
  72. major difference between IPv4 and IPv6 is the size of their respective **address
  73. spaces**. IPv4 uses 32-bit addresses, supporting a total of 4.3 billion possible
  74. addresses, which on the public internet are quickly becoming a scarce resource.
  75. IPv6 uses 128-bit addresses, which allows for a zillion unique addresses.
  76. Ranges of IP addresses can be described with a **subnet mask**. Such a range of
  77. IP addresses constitutes a **subnetwork**, or subnet. Though you're probably
  78. used to seeing an IPv4 address encoded like `10.20.30.40`, remember that it can
  79. also just be represented as one 32-bit number - in this case 169090600, or
  80. 0x0A141E28, and you can do bitwise math against these numbers. You generally
  81. represent a subnet with CIDR notation, such as `192.168.1.0/24`. In this case, the
  82. first 24 bits are meaningful, and all possible values for the remaining 8 bits
  83. constitute the range of addresses represented by this mask.
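
To make the bitwise math concrete, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `in_subnet` is made up for illustration; in real code you would more likely just write `ip in ipaddress.ip_network(cidr)`.

```python
import ipaddress

# Treat the dotted-quad address as a plain 32-bit integer.
addr = int(ipaddress.IPv4Address("10.20.30.40"))
print(addr, hex(addr))  # 169090600 0xa141e28


def in_subnet(ip: str, cidr: str) -> bool:
    """Check membership by masking off the host bits and comparing."""
    network, prefix_len = cidr.split("/")
    # A /24 keeps the top 24 bits and clears the bottom 8.
    mask = (0xFFFFFFFF << (32 - int(prefix_len))) & 0xFFFFFFFF
    ip_int = int(ipaddress.IPv4Address(ip))
    net_int = int(ipaddress.IPv4Address(network))
    return (ip_int & mask) == (net_int & mask)


print(in_subnet("192.168.1.77", "192.168.1.0/24"))  # True
print(in_subnet("192.168.2.77", "192.168.1.0/24"))  # False
# 8 free bits means 2**8 addresses in the range.
print(ipaddress.ip_network("192.168.1.0/24").num_addresses)  # 256
```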
  84. IPv4 has several subnets reserved for this and that. Some important ones are:
  85. * `0.0.0.0/8` - current network. On many systems, you can treat `0.0.0.0` as all
  86. IP addresses assigned to your device
  87. * `127.0.0.0/8` - loopback network. These addresses refer to yourself.
  88. * `10.0.0.0/8`, `172.16.0.0/12`, and `192.168.0.0/16` are reserved for private
  89. networks - you can allocate these addresses on a LAN.
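
As a rough sketch, the reserved ranges listed above can be checked in Python like this. The `RESERVED` table and the `classify` function are illustrative names, and the table only covers the ranges mentioned in this post, not every reserved block.

```python
import ipaddress

# Only the ranges discussed above; the real list of reserved blocks is longer.
RESERVED = {
    "current network": ["0.0.0.0/8"],
    "loopback": ["127.0.0.0/8"],
    "private": ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"],
}


def classify(ip: str) -> str:
    """Return the label of the reserved range containing ip, or 'other'."""
    addr = ipaddress.IPv4Address(ip)
    for label, blocks in RESERVED.items():
        if any(addr in ipaddress.IPv4Network(block) for block in blocks):
            return label
    return "other"


for ip in ["127.0.0.1", "10.1.2.3", "172.31.255.1", "172.32.0.1", "192.168.0.10"]:
    print(ip, "->", classify(ip))
```

Note that `172.32.0.1` falls just outside `172.16.0.0/12`, which only covers `172.16.x.x` through `172.31.x.x`.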
  90. An IPv4 packet includes, among other things: a **time to live**, or TTL, which
  91. limits how many hops the packet can take; the **protocol**, such as TCP; the
  92. **source** and **destination** addresses; a header checksum; and the
  93. **payload**, which is specific to the higher level protocol in use.
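
If you want to see those fields as bytes, the following sketch packs and unpacks a 20-byte IPv4 header (no options) with Python's `struct` module. The function names are invented for this example, and the addresses and TTL are arbitrary sample values.

```python
import ipaddress
import struct

# version/IHL, DSCP/ECN, total length, identification, flags/fragment offset,
# TTL, protocol, header checksum, source address, destination address
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit words."""
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src: str, dst: str, ttl: int, protocol: int, payload_len: int) -> bytes:
    fields = [
        (4 << 4) | 5,                       # IP version 4, header is 5 32-bit words
        0,                                  # DSCP/ECN
        IPV4_HEADER.size + payload_len,     # total length
        0,                                  # identification
        0,                                  # flags + fragment offset
        ttl,
        protocol,
        0,                                  # checksum placeholder
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    ]
    fields[7] = internet_checksum(IPV4_HEADER.pack(*fields))
    return IPV4_HEADER.pack(*fields)


def parse_header(raw: bytes) -> dict:
    ver_ihl, _, total_len, _, _, ttl, proto, csum, src, dst = IPV4_HEADER.unpack(raw[:20])
    return {
        "version": ver_ihl >> 4,
        "total_length": total_len,
        "ttl": ttl,
        "protocol": proto,                  # 6 is TCP, 17 is UDP
        "checksum": csum,
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
    }


header = build_header("10.20.30.40", "192.168.1.1", ttl=64, protocol=6, payload_len=100)
print(parse_header(header))
# A header with a valid checksum sums to zero when checked again.
print(internet_checksum(header) == 0)
```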
  94. Given the limited size of the IPv4 space, most networks are designed with an
  95. isolated LAN that uses **NAT**, or network address translation, to translate IP
  96. addresses from the WAN. Basically, a router or similar component will translate
  97. internal IP addresses (allocated from the private subnets) to its own external
  98. IP address, and vice versa, when passing communications along to the WAN. With
  99. IPv6 there are so many IP addresses that you don't need to use NAT. If you're
  100. wondering whether or not we'll ever run out of IPv6 addresses - leave that to
  101. someone else to solve tens of millions of years from now.
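
Here is a toy model of the translation table a NAT device keeps. It assumes the common port-based flavor of NAT, where the router tells internal hosts apart by rewriting the source port (ports are covered in the transport layer section). The `ToyNat` class, the starting port, and all addresses are made up for illustration.

```python
class ToyNat:
    """A toy port-based NAT table; real implementations also track
    protocol, timeouts, and connection state."""

    def __init__(self, external_ip: str, first_port: int = 40000):
        self.external_ip = external_ip
        self.next_port = first_port
        self.outgoing = {}   # (internal_ip, internal_port) -> external_port
        self.incoming = {}   # external_port -> (internal_ip, internal_port)

    def outbound(self, src_ip: str, src_port: int):
        """Rewrite an internal source address to the router's external one."""
        key = (src_ip, src_port)
        if key not in self.outgoing:
            self.outgoing[key] = self.next_port
            self.incoming[self.next_port] = key
            self.next_port += 1
        return self.external_ip, self.outgoing[key]

    def inbound(self, dst_port: int):
        """Map a reply arriving on an external port back to the internal host."""
        return self.incoming.get(dst_port)


nat = ToyNat("203.0.113.5")
print(nat.outbound("192.168.1.10", 51000))  # ('203.0.113.5', 40000)
print(nat.outbound("192.168.1.11", 51000))  # ('203.0.113.5', 40001)
print(nat.inbound(40001))                   # ('192.168.1.11', 51000)
print(nat.inbound(49999))                   # None: no mapping, so it is dropped
```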
  102. IPv6 addresses are 128 bits long and are described with strings like
  103. `2001:0db8:0000:0000:0000:ff00:0042:8329`. Luckily the people who designed it
  104. were kind enough to realize people don't want to write that, so it can be
  105. shortened to `2001:db8::ff00:42:8329` by removing leading zeros and replacing
  106. one run of all-zero sections with `::`. Where colons are reserved for another
  107. purpose, you'll typically add brackets around the IPv6 address, such as
  108. `http://[2607:f8b0:400d:c03::64]`. The IPv6 loopback address (localhost) is
  109. `::1`, and IPv6 subnets are written the same way as in IPv4. Given how many
  110. IPv6 addresses there are, it's common to be allocated lots of them in cases when
  111. you might have expected to only receive one IPv4 address. Typically these blocks
  112. will be anywhere from /48 to /56 - which contains more addresses than the entire
  113. IPv4 space.
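
Python's standard `ipaddress` module knows the shortening rules, so a quick sketch can confirm the examples above. The `2001:db8::/56` block is just a sample prefix picked for illustration.

```python
import ipaddress

full = ipaddress.IPv6Address("2001:0db8:0000:0000:0000:ff00:0042:8329")
print(full.compressed)  # 2001:db8::ff00:42:8329
print(full.exploded)    # 2001:0db8:0000:0000:0000:ff00:0042:8329

print(ipaddress.IPv6Address("::1").is_loopback)  # True

# Even a /56 holds 2**72 addresses, far more than all of IPv4 (2**32).
block = ipaddress.IPv6Network("2001:db8::/56")
print(block.num_addresses == 2**72)  # True
print(block.num_addresses > 2**32)   # True
```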
  114. IP addresses are often **static**, which means the node connecting to the
  115. network already knows its IP address and starts using it right away. They may
  116. also be **dynamic**, and are allocated by some computer on the network with the
  117. **DHCP** protocol.
  118. IPsec also lives in layer 3.
  119. ## 4: Transport Layer
  120. *Generally implemented by the kernel*
  121. The transport layer is where you have higher level protocols, through which much
  122. of the work gets done. Protocols here include TCP, UDP, ICMP (used for ping),
  123. and others. These protocols are used to power application-layer protocols.
  124. **TCP**, or the transmission control protocol, is probably the most popular
  125. transport layer protocol out there. It turns the unreliable internet protocol
  126. into a reliable byte stream. TCP (tries to) make four major guarantees: data
  127. will arrive, will arrive exactly once, will arrive in the correct order, and
  128. will be the correct data.
  129. TCP takes a stream of bytes and breaks it up into **segments**. Each segment is
  130. then stuck into an IP packet and sent on its way. A TCP segment includes the
  131. source and destination **ports**, which are used to distinguish between
  132. different application-layer protocols in use and to distinguish between
  133. different applications using the protocol on the same host; a **sequence
  134. number**, which is used to order the packet; an **ACK number**, which is used to
  135. inform the other end that it has received some packet and it can stop retrying;
  136. a checksum; and the data itself. The protocol also includes a handshake process
  137. and other housekeeping processes that the application needn't be aware of.
  138. Generally speaking, the overhead of TCP is significant for real-time
  139. applications.
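
To give a feel for what sequence and ACK numbers buy you, here is a heavily simplified sketch of the receiving side. It is not how a real TCP stack works: it assumes sequence numbers start at 0, that segments never partially overlap, and the `reassemble` function is an invented name.

```python
def reassemble(segments):
    """Rebuild a byte stream from (sequence_number, data) pairs that may
    arrive out of order or more than once.

    Returns the contiguous bytes received so far and the next expected
    sequence number, which is what the receiver would send back as its ACK.
    """
    received = {}
    for seq, data in segments:
        received.setdefault(seq, data)  # a duplicate segment is ignored

    stream = bytearray()
    expected = 0
    while expected in received:
        data = received[expected]
        stream += data
        expected += len(data)
    return bytes(stream), expected


# Out of order, with one duplicate: everything can be delivered.
print(reassemble([(6, b"world"), (0, b"hello "), (0, b"hello "), (11, b"!")]))
# (b'hello world!', 12)

# A gap at bytes 2-3: only the first segment can be delivered, and the
# ACK of 2 tells the sender which byte the receiver is still waiting for.
print(reassemble([(0, b"ab"), (4, b"ef")]))
# (b'ab', 2)
```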
  140. Most TCP servers will **bind** to a certain port to **listen** for incoming
  141. connections, via the operating system's **socket** implementation. Many TCP
  142. **clients** can connect to one server.
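
A minimal sketch of bind, listen, and connect using Python's `socket` module follows. It stays on the loopback address and asks the OS for any free port (port 0), so it needs no elevated permissions; the uppercase "protocol" is made up for the demo.

```python
import socket
import threading


def serve_once(server: socket.socket) -> None:
    """Accept a single client, read its message, and reply in uppercase."""
    conn, _addr = server.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


# Server side: bind to a port and listen for incoming connections.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
server.listen()
port = server.getsockname()[1]

thread = threading.Thread(target=serve_once, args=(server,))
thread.start()

# Client side: connect to the server's address and port.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello over tcp")
    reply = client.recv(1024)

thread.join()
server.close()
print(reply)  # b'HELLO OVER TCP'
```

The handshake, retries, and ordering all happen inside the kernel; the application only sees a stream of bytes.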
  143. A port is a 16-bit unsigned integer. Most applications have a default port
  144. they're known to use, such as 80 for HTTP. Originally these numbers were
  145. allocated by the internet police, but this has fallen out of practice. On most
  146. systems, ports less than 1024 require elevated permissions to listen on.
  147. **UDP**, or the user datagram protocol, is the second most popular transport
  148. layer protocol, and is the lighter of the two. UDP is a paper thin layer on top
  149. of IP. A UDP packet contains a source port, destination port, checksum, and a
  150. payload. This protocol is fast and lightweight, but makes none of the promises
  151. TCP makes - UDP "**datagrams**" may arrive multiple or zero times, in a
  152. different order than they were sent, and possibly with data errors. Many people
  153. who use UDP will implement these guarantees themselves in some lighter-weight
  154. fashion than TCP. Importantly, UDP source IPs can be spoofed and the destination
  155. has no means of knowing where it really came from - TCP avoids this by doing a
  156. handshake before exchanging any data.
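
For comparison with TCP, here is a small sketch of sending one UDP datagram over loopback with Python's `socket` module. There is no connection and no handshake: the sender just fires the datagram at an address. The two-second timeout is an arbitrary choice that acknowledges the datagram is allowed to never arrive.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))    # port 0 lets the OS pick a free port
receiver.settimeout(2.0)           # UDP makes no delivery promise
port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"ping", ("127.0.0.1", port))

try:
    data, source = receiver.recvfrom(1024)
except socket.timeout:
    data, source = None, None      # lost datagrams are the application's problem
finally:
    sender.close()
    receiver.close()

print(data, source)
```

The `source` address printed here is simply whatever the datagram claimed, which is why it cannot be trusted on its own.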
  157. UDP can also issue broadcasts, which are datagrams that are sent to every node
  158. on the network. Such datagrams should be addressed to `255.255.255.255`. There's
  159. also multicast, which specifies a subset of all nodes to send the datagram to.
  160. Note that both of these have limited support in real-world networks.
  161. ## 5 & 6: Session and presentation
  162. Think of these as extensions of layer 7, the application layer. Technically
  163. things like SSL, compression, etc. are done here, but in practice this doesn't
  164. have any important technical implications.
  165. ## 7: Application layer
  166. *Generally implemented by end-user software*
  167. The application layer is the uppermost layer of the network and it's what all
  168. the other layers are there for. At this layer you have all of the hundreds of
  169. thousands of application-specific protocols out there.
  170. **DNS**, or the domain name system, is a protocol for mapping domain names (e.g.
  171. google.com) to IP addresses (e.g. 209.85.201.100), among other features. DNS
  172. servers keep track of DNS records, which associate names with records of various
  173. types. Common records include A, which maps a name to an IPv4 address, AAAA for
  174. IPv6, CNAME for aliases, and MX for email records. The most popular DNS server
  175. is BIND, which you can run on your own network to operate a private name system.
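
To show what a DNS lookup looks like on the wire, here is a sketch that builds (but does not send) a query for a name's A or AAAA record. The function name `build_dns_query` and the query ID are arbitrary; the layout is a 12-byte header followed by one question, with record type 1 for A and 28 for AAAA.

```python
import struct

RECORD_TYPES = {"A": 1, "AAAA": 28}


def build_dns_query(name: str, record_type: str = "A", query_id: int = 0x1234) -> bytes:
    """Build a single-question DNS query message."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question,
    # 0 answer, 0 authority, and 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is sent as length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # Question trailer: record type, then class 1 (IN, the internet).
    question = qname + struct.pack("!HH", RECORD_TYPES[record_type], 1)
    return header + question


query = build_dns_query("google.com")
print(len(query))    # 28
print(query[12:24])  # b'\x06google\x03com\x00'
```

Sending these bytes in a UDP datagram to port 53 of a DNS server would get back a response in a similar format, which is what your system's resolver does for you.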
  176. Some other UDP protocols: NTP, the network time protocol; DHCP, which assigns
  177. dynamic IP addresses on networks; and nearly all real-time video and audio
  178. streaming protocols (like VoIP). Many video games also use UDP for their
  179. multiplayer networking.
  180. TCP is more popular than UDP and powers many, many, many applications, due
  181. largely to the fact that it simplifies the complex intricacies of networking.
  182. You're probably familiar with HTTP, which web browsers use to fetch
  183. resources. Email applications often communicate over TCP with IMAP to retrieve
  184. the contents of your inbox, and SMTP to send emails to other servers. SSH (the
  185. secure shell), FTP (file transfer protocol), IRC (internet relay chat), and
  186. countless other protocols also use TCP.
  187. - - -
  188. Hopefully this article helps you gain a general understanding of how computers
  189. talk to each other. In my own experience, I've used a broad understanding of the
  190. entire stack and a deep understanding of layers 3 and up. I expect most
  191. programmers today need a broad understanding of the entire stack and a deep
  192. understanding of layer 7, and I hope that most programmers would seek a deep
  193. understanding of layer 4 as well.
  194. Please leave some feedback if you appreciated this article - I may do more
  195. similar articles in the future, giving a broad introduction to other topics. The
  196. next topics I have in mind are security and encryption (as separate posts).