<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Concepts on DRANET</title><link>https://dranet.sigs.k8s.io/docs/concepts/</link><description>Recent content in Concepts on DRANET</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://dranet.sigs.k8s.io/docs/concepts/index.xml" rel="self" type="application/rss+xml"/><item><title>Linux Network Namespaces and Interfaces</title><link>https://dranet.sigs.k8s.io/docs/concepts/linux-network-interfaces/</link><pubDate>Thu, 05 Jun 2025 11:20:46 +0000</pubDate><guid>https://dranet.sigs.k8s.io/docs/concepts/linux-network-interfaces/</guid><description>&lt;p>Network namespaces create isolated network stacks, including network devices, IP addresses, routing tables, rules, &amp;hellip;
This separation is crucial for containerization.&lt;/p>
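&lt;p>As a minimal illustration of this isolation (assuming a Linux host), the devices visible from the current network namespace can be read from &lt;code>/sys/class/net&lt;/code>; a freshly created namespace would show only the loopback device:&lt;/p>

```python
import os

# /sys/class/net lists the network devices visible in the *current*
# network namespace; each namespace has its own, isolated set.
devices = sorted(os.listdir("/sys/class/net"))

# The loopback device "lo" exists in every network namespace.
print("lo" in devices)
```

&lt;p>Running the same listing inside a new namespace (e.g. via &lt;code>ip netns exec&lt;/code>) would show only &lt;code>lo&lt;/code>, regardless of the devices present on the host.&lt;/p>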
&lt;p>Network namespaces also contain network devices that &lt;a href="https://man7.org/linux/man-pages/man7/network_namespaces.7.html">can live in exactly one network
namespace&lt;/a>:&lt;/p>
&lt;blockquote>
&lt;p>A physical network device can live in exactly one network
namespace. When a network namespace is freed (i.e., when the last
process in the namespace terminates), its physical network devices
are moved back to the initial network namespace (not to the
namespace of the parent of the process).&lt;/p>&lt;/blockquote></description></item><item><title>Making Networks Flexible</title><link>https://dranet.sigs.k8s.io/docs/concepts/flexible-networks/</link><pubDate>Thu, 05 Jun 2025 11:20:46 +0000</pubDate><guid>https://dranet.sigs.k8s.io/docs/concepts/flexible-networks/</guid><description>&lt;p>Think about how we build things. In the old days of IT, setting up a server was like building a detailed model airplane.
Every piece had a specific part number and a precise spot where it had to be glued. The network card was &lt;code>eth0&lt;/code>,
and it was &lt;em>always&lt;/em> &lt;code>eth0&lt;/code>. If that changed, things broke.&lt;/p>
&lt;p>Today, in the world of Kubernetes and the cloud, we build things more like we&amp;rsquo;re using Lego bricks.
We have a big box of resources—CPU, memory, and networking—and we snap them together to build what we need,
when we need it. When we&amp;rsquo;re done, we take it apart and throw the bricks back in the box for the next project.&lt;/p></description></item><item><title>Interface Status</title><link>https://dranet.sigs.k8s.io/docs/concepts/interface-status/</link><pubDate>Sun, 25 May 2025 11:30:40 +0000</pubDate><guid>https://dranet.sigs.k8s.io/docs/concepts/interface-status/</guid><description>&lt;h3 id="understanding-interface-status-output">Understanding Interface Status Output&lt;/h3>
&lt;p>When DRANET allocates a network interface to a Pod via a &lt;code>ResourceClaim&lt;/code>, it publishes the status of the allocated device within the &lt;code>ResourceClaim&lt;/code>&amp;rsquo;s &lt;code>status&lt;/code> field. This provides crucial insights into the readiness and configuration of the network interface from a Kubernetes perspective, adhering to the standardized device status defined in &lt;a href="https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/4817-resource-claim-device-status/README.md">KEP-4817&lt;/a>.&lt;/p>
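&lt;p>For illustration, an allocated device&amp;rsquo;s status might look like the following sketch. The field layout follows KEP-4817; the driver, pool, device, address, and IP values here are invented for this example:&lt;/p>

```yaml
status:
  devices:
    - driver: dra.net          # hypothetical driver name
      pool: worker-node-1      # hypothetical pool name
      device: eth1             # hypothetical device name
      conditions:
        - type: Ready
          status: "True"
          reason: NetworkDeviceReady
      networkData:
        interfaceName: eth1
        hardwareAddress: "00:11:22:33:44:55"
        ips:
          - 10.0.0.10/24
```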
&lt;p>After a &lt;code>ResourceClaim&lt;/code> is processed and a network device is allocated, its status is reflected under &lt;code>ResourceClaim.status.devices&lt;/code>. This section contains &lt;code>conditions&lt;/code> and &lt;code>networkData&lt;/code> for each allocated device.&lt;/p></description></item><item><title>Hardware Efficiency</title><link>https://dranet.sigs.k8s.io/docs/concepts/hardware-efficiency/</link><pubDate>Sun, 25 May 2025 11:20:46 +0000</pubDate><guid>https://dranet.sigs.k8s.io/docs/concepts/hardware-efficiency/</guid><description>&lt;h2 id="scaling-out-not-just-up">Scaling Out, Not Just Up&lt;/h2>
&lt;p>The journey of computing has always been a quest for greater efficiency. From hypervisors carving up physical servers to containers offering even more granular control, the pattern is clear. Now, with AI/ML and High-Performance Computing (HPC) taking center stage, a new frontier in resource optimization is opening up, especially around specialized hardware like high-performance networking.&lt;/p>
&lt;p>This is where solutions like DRANET, a Kubernetes network driver, are making significant strides. By cleverly using Kubernetes&amp;rsquo; Dynamic Resource Allocation (DRA), DRANET offers a declarative, Kubernetes-native method to manage and assign advanced network interfaces, including those powerful RDMA-capable NICs, directly to Pods. This isn&amp;rsquo;t merely about network connectivity; it&amp;rsquo;s a more intelligent approach to utilizing the potent, and often costly, hardware that underpins today&amp;rsquo;s distributed applications.&lt;/p></description></item><item><title>RDMA</title><link>https://dranet.sigs.k8s.io/docs/concepts/rdma/</link><pubDate>Sun, 25 May 2025 11:20:46 +0000</pubDate><guid>https://dranet.sigs.k8s.io/docs/concepts/rdma/</guid><description>&lt;h2 id="understanding-rdma-components-in-linux">Understanding RDMA Components in Linux&lt;/h2>
&lt;p>RDMA (Remote Direct Memory Access) is a powerful technology enabling applications to directly read from or write to memory on a remote machine without involving the CPU, caches, or operating system of either machine during the data transfer. This achieves ultra-low latency and high throughput, making it ideal for high-performance computing (HPC), AI/ML, and storage.&lt;/p>
&lt;p>In a Linux system, the RDMA ecosystem involves several interconnected components:&lt;/p></description></item><item><title>RDMA Device Handling</title><link>https://dranet.sigs.k8s.io/docs/concepts/rdma-modes/</link><pubDate>Sun, 25 May 2025 11:20:46 +0000</pubDate><guid>https://dranet.sigs.k8s.io/docs/concepts/rdma-modes/</guid><description>&lt;p>DRANET provides robust support for Remote Direct Memory Access (RDMA) devices, essential for high-performance computing (HPC) and AI/ML workloads that require ultra-low latency communication. DRANET&amp;rsquo;s RDMA implementation intelligently handles device allocation based on the host system&amp;rsquo;s RDMA network namespace mode.&lt;/p>
&lt;h3 id="rdma-device-handling-in-dranet">RDMA Device Handling in DRANET&lt;/h3>
&lt;p>DRANET manages three primary types of RDMA-related components for Pods:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>RDMA Character Devices:&lt;/strong> These are user-space interfaces (e.g., &lt;code>/dev/infiniband/uverbsN&lt;/code>, &lt;code>/dev/infiniband/rdma_cm&lt;/code>) that user applications interact with to set up RDMA resources.&lt;/p></description></item><item><title>How It Works</title><link>https://dranet.sigs.k8s.io/docs/concepts/howitworks/</link><pubDate>Thu, 19 Dec 2024 11:20:46 +0000</pubDate><guid>https://dranet.sigs.k8s.io/docs/concepts/howitworks/</guid><description>&lt;p>The networking DRA driver uses gRPC to communicate with the Kubelet via the &lt;a href="https://github.com/kubernetes/kubernetes/blob/f141907ddd89998e821eb1047885722c8ba8922b/staging/src/k8s.io/kubelet/pkg/apis/dra/v1/api.proto">DRA API&lt;/a> and the Container Runtime via &lt;a href="https://github.com/containerd/nri">NRI&lt;/a>. This architecture improves supportability and reduces the complexity of the solution; it also makes the driver fully compatible with, and agnostic of, the existing CNI plugins in the cluster.&lt;/p>
&lt;p>Once the Pod network namespace has been created, the DRA driver receives a gRPC call from the Container Runtime via NRI to execute the corresponding configuration. A more detailed diagram can be found in:&lt;/p></description></item><item><title>References</title><link>https://dranet.sigs.k8s.io/docs/concepts/references/</link><pubDate>Thu, 19 Dec 2024 11:20:46 +0000</pubDate><guid>https://dranet.sigs.k8s.io/docs/concepts/references/</guid><description>&lt;ul>
&lt;li>&lt;a href="https://dranet.sigs.k8s.io/docs/kubernetes_network_driver_model_dranet_paper.pdf">The Kubernetes Network Driver Model: A Composable Architecture for High-Performance Networking&lt;/a> - This paper introduces the Kubernetes Network Driver model and provides a detailed performance evaluation of DRANET, demonstrating significant bandwidth improvements for AI/ML workloads.&lt;/li>
&lt;/ul>
&lt;iframe src="https://dranet.sigs.k8s.io/docs/kubernetes_network_driver_model_dranet_paper.pdf" width="50%" height="400" frameborder="0" scrolling="auto" allowfullscreen="allowfullscreen">&lt;/iframe>

&lt;ul>
&lt;li>&lt;a href="https://www.youtube.com/playlist?list=PL69nYSiGNLP2E8vmnqo5MwPOY25sDWIxb">The Challenges of AI/ML Multi-Node Workloads in Kubernetes - Antonio Ojea, Google - Regular SIG Network Meeting for 2025-07-17&lt;/a>&lt;/li>
&lt;/ul>
&lt;iframe src="https://docs.google.com/presentation/d/e/2PACX-1vSBButnm46ReLtbtgBa2b4xkmr3oXEtH5yf10xsQ4fjcqF4jSOc5MzeZQUS02Ev2j6DKFj8vQAjCIoy/pubembed?start=true&amp;loop=true&amp;delayms=3000" frameborder="0" width="480" height="299" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true">&lt;/iframe>
&lt;ul>
&lt;li>&lt;a href="https://docs.google.com/presentation/d/1Vdr7BhbYXeWjwmLjGmqnUkvJr_eOUdU0x-JxfXWxUT8/edit?usp=sharing">Kubernetes Network Drivers, Antonio Ojea, Presentation&lt;/a>&lt;/li>
&lt;/ul>
&lt;iframe src="https://docs.google.com/presentation/d/e/2PACX-1vRVritcaQFYkvaPuTPsxkgOt0ZfWhqYPcCjNN0UgZcEh9HR1yh3bFDXSOiPbPUayoMzbefZ_qvFoWCX/pubembed?start=true&amp;loop=true&amp;delayms=3000" frameborder="0" width="480" height="299" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true">&lt;/iframe>
&lt;ul>
&lt;li>
&lt;p>&lt;a href="https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/3063-dynamic-resource-allocation/README.md">KEP 3063 - Dynamic Resource Allocation #3063&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://github.com/kubernetes/enhancements/issues/4381">KEP 4381 - DRA: structured parameters #4381&lt;/a>&lt;/p></description></item></channel></rss>