The easiest option is to put the cap on the 12V side. That way your ESP has time to do its thing until the capacitor discharges to 3.3V plus whatever dropout overhead the LDO requires. Assuming that is 1.2V (which is typical for an LM1117), your device will keep working while the cap discharges from 12V down to 4.5V.
The capacitor equation is I=C(dv/dt): the current equals the capacitance times the rate of change of voltage, in volts per second. The change in voltage here is (12-4.5=)7.5V, and we want this to take 1 second (the time the ESP needs to stay alive). The current is 0.25A. In other words: 0.25A=C*7.5V/s, meaning C is 0.25/7.5=0.033F, or about 33000uF.
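As a quick sanity check of that arithmetic, here is a minimal Python sketch. The function name and its parameters are made up for illustration, and it assumes the load draws a constant current for the whole hold-up time, which is roughly what an LDO feeding a fixed load looks like from the 12V side.

```python
def cap_for_constant_current(i_load, v_start, v_end, hold_time):
    """Capacitance in farads from I = C * dV/dt, rearranged to C = I * dt / dV.

    Assumes a constant current draw (hypothetical helper, not from any library).
    """
    return i_load * hold_time / (v_start - v_end)


# Values from the text: 0.25A load, 12V down to 4.5V, 1 second of hold-up.
c_ldo = cap_for_constant_current(0.25, 12.0, 4.5, 1.0)
print(f"{c_ldo * 1e6:.0f} uF")
```

Changing the hold-up time or the dropout voltage in the call shows how quickly the required capacitance grows.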
Note that this assumes an LDO, which burns away a fair bit of power as heat. You could get away with a smaller cap if your circuit uses a buck converter. In that case, the current on the 12V side isn't constant: the buck converter varies it so that the input power equals the output power plus losses. Assuming losses are 20% or so (most buck converters are more efficient), your ESP will pull (0.25A*3.3V)/0.8, or about 1W, on the 12V side of things.
According to here, the capacitor needed in this case is C=(2*P*t)/(V0^2-V1^2), which here comes down to (2*1W*1sec)/(12^2-4.5^2)=16000uF.
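The formula follows from an energy balance: the energy the cap gives up, 0.5*C*(V0^2-V1^2), has to equal P*t. A minimal Python sketch of the same calculation is below. The function name is hypothetical, the 80% efficiency is the assumption from the text, and it uses the unrounded input power (about 1.03W), so the result comes out slightly above the 16000uF figure.

```python
def cap_for_constant_power(p_in, v_start, v_end, hold_time):
    """Capacitance in farads for a constant-power load.

    Energy balance: 0.5 * C * (V0^2 - V1^2) = P * t
    (hypothetical helper, not from any library).
    """
    return 2 * p_in * hold_time / (v_start**2 - v_end**2)


# Input power of the buck converter, assuming 80% efficiency.
p_in = 0.25 * 3.3 / 0.8

# Same voltage window and hold-up time as the LDO case.
c_buck = cap_for_constant_power(p_in, 12.0, 4.5, 1.0)
print(f"P_in = {p_in:.2f} W, C = {c_buck * 1e6:.0f} uF")
```

Note that the 4.5V lower limit is carried over from the LDO case; a real buck converter has its own minimum input voltage, so that parameter should come from its datasheet.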
Note that it's a bit hard to find capacitors that large, but you can parallel up capacitors and their capacitances add. For instance, you could get two 10000uF/16V capacitors and parallel them up. Do note that a capacitance this high may lead to inrush current issues when the 12V is turned on, so you may need to take measures for that.