Can't Get 10Gbit Working

Waffles!

Hello all. I had followed this guide on 40Gbit networking and was content with it, so I never bothered to do the InfiniBand stuff. It has been working for a long time now, but I wanted to try running Ubuntu for a change. So I installed the latest Mellanox OFED Linux drivers, 4.4-2.0.7, and that installed fine. The other machine has an address of 172.16.44.2, a netmask of 255.255.255.0, and a gateway of 172.16.44.1. So I went into network connections and did the same thing on this machine: set the address to 172.16.44.1, the netmask to 255.255.255.0, and the gateway to 172.16.44.2. A few moments later it said it was connected, and I can apparently ping 172.16.44.1 just fine. However, when I try connecting to the IP in a browser or try to add it to a map, it refuses to connect. What's going on here? I thought it'd be as simple as it was on Windows. What else do I have to do to get this working again?

 

What's your network setup like?

 

Is this peer to peer (so just one cable connecting both hosts) or is there a switch involved as well?

 

In a peer to peer network, no gateway is needed.
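To illustrate why, here is a minimal Python sketch using the two addresses and the netmask from the first post. The helper name `on_same_subnet` is made up for this example. It only checks whether both hosts land in the same network, which is the condition for them to talk directly without a gateway.

```python
import ipaddress


def on_same_subnet(addr_a: str, addr_b: str, netmask: str) -> bool:
    """Return True when both addresses belong to the same IP network.

    Hypothetical helper for illustration; it is not part of any driver or tool.
    """
    # strict=False lets us pass a host address and get its enclosing network.
    net_a = ipaddress.ip_network(f"{addr_a}/{netmask}", strict=False)
    net_b = ipaddress.ip_network(f"{addr_b}/{netmask}", strict=False)
    return net_a == net_b


# Addresses and netmask as given in the original post.
print(on_same_subnet("172.16.44.1", "172.16.44.2", "255.255.255.0"))  # True
```

Both hosts sit in 172.16.44.0/24, so traffic between them should go straight out the directly connected interface.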

 

Also, what kind of cable and cards are you using?

PC Specs - AMD Ryzen 7 5800X3D MSI B550M Mortar - 32GB Corsair Vengeance RGB DDR4-3600 @ CL16 - ASRock RX7800XT 660p 1TBGB & Crucial P5 1TB Fractal Define Mini C CM V750v2 - Windows 11 Pro

 

1 minute ago, NelizMastr said:

What's your network setup like?

 

Is this peer to peer (so just one cable connecting both hosts) or is there a switch involved as well?

 

In a peer to peer network, no gateway is needed.

 

Also, what kind of cable and cards are you using?

Peer to peer, just like how they set it up, and also using the same Mellanox cards.

Just now, wojtepanik said:

dns settings?

Not set, just using static IPs to connect them.

3 minutes ago, Waffles! said:

Peer to peer, just like how they set it up, and also using the same Mellanox cards.

Not set, just using static IPs to connect them.

So you're trying to access a Samba share you made on the Ubuntu box or what? If the host you're connecting to doesn't have any services that you can reach, there's the reason you can't do anything via browser or drive mapping.

 

Can both hosts ping EACH OTHER? If so, no issue there. If a ping only works one way, it's a firewall issue.
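Keep in mind that a ping only proves ICMP gets through; a browser or a drive mapping needs a service listening on a TCP port. A rough Python sketch for checking that follows. `tcp_port_open` is a hypothetical helper, and the ports you test (80/443 for a web GUI, 445 for SMB) are assumptions about what the other box is serving.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only.
    """
    try:
        # create_connection performs the full TCP handshake, then we close it.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `tcp_port_open("172.16.44.2", 80)` returning False while ping works would suggest nothing is listening on that port (or something is filtering it), rather than a problem with the link itself.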

 

33 minutes ago, NelizMastr said:

So you're trying to access a Samba share you made on the Ubuntu box or what? If the host you're connecting to doesn't have any services that you can reach, there's the reason you can't do anything via browser or drive mapping.

 

Can both hosts ping EACH OTHER? If so, no issue there. If a ping only works one way, it's a firewall issue.

Sorry, I should have been clearer. Before, I was on Windows and had successfully gotten this to work with FreeNAS. If I remember correctly, all I needed to do to get both of them to work was set the two IPs and gateways. The only thing that's changed is that I went from Windows to Ubuntu. I can ping from Ubuntu with no dropped packets. I don't think it's a firewall issue because UFW is inactive. Whenever I try to access the web GUI via the IP, it fails to connect. The link also sometimes doesn't want to stay connected; it just randomly connects and disconnects.

  • 1 month later...

So...I'm not really sure what your ultimate goal or objective is other than trying to get the two systems to talk to each other.

 

If you're using Ubuntu, try this:

 

(DISCLAIMER, I'm using Mellanox ConnectX-4 4x EDR Infiniband cards (100 Gbps), so there might be some differences.)

 

1) It would help to google how to enable the root user in Ubuntu, because root is disabled by default. You can't su to root in Ubuntu (by default), but being able to is EXTREMELY beneficial: you can just fire off the commands in rapid succession rather than having to do "sudo sh -c 'command'" and HOPE that it works (sometimes it does, and sometimes it doesn't).

 

Once you've enabled the root user (and given it a password) in Ubuntu, su to root.

 

2) Since I don't really know your setup, here's what I would do (in order): (run these commands and look for the following)

 

# mst start

 

# mlxfwmanager

(to get the PCI device name because you might need it later)

 

# hca_self_test.ofed

 

You should look to see if there are any fails in the output on both systems.

 

You should check to see that the ports are up and that a link has been established. If there isn't, then we'll come back to this a little later.

 

# ibv_devinfo

 

(Again, there's a real possibility that this command won't work for you if you don't have an InfiniBand device, so it might take a little research/googling to find the equivalent command.)

 

# ibdev2netdev

 

This should show which network interface your Mellanox device/port maps to. It should show that port as being "Up" if it is working. (To see the IP address assigned to that interface, check it with the ip command.)

 

# ip link show ib0

 

# mlxconfig -d <<PCI device name>> query

(e.g. # mlxconfig -d /dev/mst/mt4115_pciconf0 query)

 

# mlxfwmanager -d <<PCI device name>> --query

 

So this is just some basic diagnostics on the card/ports themselves. If anything looks anomalous here, you'll have to start digging deeper. Much deeper.

 

The other thing I would suggest is setting the gateway to something other than the IP address of the other machine.

 

If I understood you correctly, your network config probably looks something like this (written here as SUSE-style ifcfg files; Ubuntu keeps the same settings elsewhere):

 

machine1: /etc/sysconfig/network/ifcfg-eth0
BOOTPROTO='static'
BROADCAST=''
ETHTOOL_OPTIONS=''
IPADDR='172.16.44.1/24'

/etc/sysconfig/network/ifroute-eth0
default 172.16.44.2

machine2: /etc/sysconfig/network/ifcfg-eth0
BOOTPROTO='static'
BROADCAST=''
ETHTOOL_OPTIONS=''
IPADDR='172.16.44.2/24'

/etc/sysconfig/network/ifroute-eth0
default 172.16.44.1

 

right? Something like that?

 

I would suggest changing the gateway to something like 172.16.44.100 on both machines rather than pointing the gateways towards each other. That's probably a part of your problem.

 

Then try pinging each machine from the other to see if it works.

 

(This is how I have my machines set up. I have four compute nodes, all with Mellanox ConnectX-4 dual port VPI 4x EDR IB 100 Gbps cards. They all share a common IPv4 gateway address, each card/port has its own address, and this is working for me.)

 

Also, with regard to opening a browser and being able to connect to the machine: there is no indication in your original question that you have some kind of web server running (unless you're talking about FreeNAS, which I don't have any experience with).

 

But try this first and see.

 

I'm somewhat surprised that this worked in Windows as well. With the two machines having different gateways, I would have expected them not to see each other, since each might think the other doesn't belong to the same network. Not really sure about that one.
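For what it's worth, a host normally decides whether a peer is on its own network from the address and netmask, and the default gateway only comes into play for destinations outside that network. Here is a rough Python sketch of that longest-prefix-match lookup. The two-entry routing table is a hypothetical stand-in for what machine1 would have, and `pick_route` is a made-up helper, not a real API.

```python
import ipaddress

# Hypothetical routing table for machine1: (prefix, next_hop).
# next_hop of None means "on-link": send directly out the interface.
ROUTES_MACHINE1 = [
    ("172.16.44.0/24", None),        # connected route from address + netmask
    ("0.0.0.0/0", "172.16.44.2"),    # default route via the configured gateway
]


def pick_route(dest, routes):
    """Return the (prefix, next_hop) entry with the longest matching prefix."""
    dest_ip = ipaddress.ip_address(dest)
    best = None
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        if dest_ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    if best is None:
        return None  # no route to this destination
    return (str(best[0]), best[1])


print(pick_route("172.16.44.2", ROUTES_MACHINE1))  # ('172.16.44.0/24', None)
print(pick_route("8.8.8.8", ROUTES_MACHINE1))      # ('0.0.0.0/0', '172.16.44.2')
```

Under these assumptions, the peer at 172.16.44.2 matches the connected /24 route, so the gateway setting would not be consulted for traffic between the two machines.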

 

But you can try this and see what happens.

IB >>> ETH
