On the EnvironmentWebcam component, you need to assign a Render Target that receives your webcam feed.
So the steps are roughly:
1. Set up a webcam capture in Unreal that outputs to a Render Target (there are several tutorials online; search for “Unreal webcam to render target”).
2. Assign that Render Target to the Render Target field of the EnvironmentWebcam component.
Once the webcam feed is correctly bound to that Render Target, Convai Vision will use your webcam input.