Hmm, I remembered something... a while back, a system program was stalling the boot in certain circumstances (it has since been fixed).
IIRC it was the Date command together with the SERVER switch, which pulls the current time from a time server address. I can see that you don't use that but TimeGuard instead... who knows, maybe it suffers from a similar problem? So my suggestion would be to disable TimeGuard first and go from there.
And at any rate, I assume TimeGuard also needs the network to be up and running, so I'd recommend not starting it from WBStartup, but instead from Network-Startup. You could e.g. put a "Wait 5 SECS" (adjust if needed) after the "; Add below this line applications that need a running network" comment line, and then start TimeGuard - if it must run from WB, start it with WBRun. Also, if it needs it, CD to its program dir before WBRunning it, and back again afterwards (just in case you add other stuff later).
This is the reason there is a Network-Startup script: To start stuff that wants a network.
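For what it's worth, the relevant part of Network-Startup could look roughly like this. It is only a sketch: the TimeGuard directory and the `CD SYS:` used to go back afterwards are placeholders made up for illustration, so substitute your own paths.

```amigados
; Add below this line applications that need a running network
Wait 5 SECS                      ; adjust if needed
CD Work:Utilities/TimeGuard      ; hypothetical install directory
WBRun TimeGuard                  ; start it as if launched from Workbench
CD SYS:                          ; hypothetical: return to the previous directory
```

The fixed Wait only makes it more likely that the network is up by the time TimeGuard starts; it does not verify it.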
Let me continue using my X1000 as usual and let's see what I get during the SS on a cold boot after leaving it all night. I will update the thread in due course.
@nbache Earlier you said that you still have this problem that I reported (or a similar problem). Is that still the case, or did you somehow fix it?
wait for the network driver to be loaded, but *not* for the network to be *up*.
You can check that with any script that accesses a website.
E.g. I'm starting three ARexx scripts in my Network-Startup *after* the above line; all of them access either a website or an address on my local home network. While the latter works, the website wasn't reachable in 3 out of 10 cases, which eventually stalled the whole boot process: the mouse pointer stayed busy forever, I couldn't access certain directories, programs would not start, etc. (Believe me, I was furiously searching for the bug in my script until I planted some debug output and realized it was locking up because it couldn't access the site in the first place.)
What I did, and what fixed *all* of my startup problems (except the occasional machine exceptions coming from the gfx board), was to add a failsafe to one of my scripts *right* after the above line.
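/*
   Checking for an up and running network.
*/
v_network=0
v_network_retries=0
DO WHILE v_network=0
   v_loss=100   /* assume total loss until the log says otherwise */
   ADDRESS COMMAND 'ping -q -c 3 192.168.178.1 >T:boot_ping.log'
   /* OPEN() is a function: test its result instead of using it as a bare clause */
   IF OPEN('pl','T:boot_ping.log','R') THEN DO
      DO WHILE ~EOF('pl')
         v_ping=READLN('pl')
         IF INDEX(v_ping,'loss')>0 THEN DO
            /* the 7th word of the summary line is the loss figure */
            v_loss=COMPRESS(SUBWORD(v_ping,7,1),'%')
            LEAVE
         END
      END
      CALL CLOSE('pl')
   END
   IF v_loss=0 THEN
      v_network=1
   ELSE DO
      v_network_retries=v_network_retries+1   /* count the failed try */
      IF v_network_retries>2 THEN DO
         ADDRESS COMMAND 'SAY ERROR: could not connect'
         IF EXISTS('T:boot_ping.log') THEN
            ADDRESS COMMAND 'delete quiet T:boot_ping.log'
         EXIT 0
      END
   END
   ADDRESS COMMAND 'wait 1'
END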
This checks and keeps the scripts from starting unless the internet is *really* accessible.
Time loss is 10 seconds at most, 4 seconds normally (but it's not noticeable, since the system isn't ready for use that early anyway). After three unsuccessful tries it stops, exits the script and gives control back to the system.
As I said, after planting this, I *never* had a WB boot stall again.
...and I faintly remember I mentioned that on the Hyperion forums too... ah well.
Yes, you are of course right, there isn't any check in the script by default.
Usually, you can just get away with the Wait I mentioned earlier, but of course it's still no guarantee.
Your solution is much more failsafe, but also rather complicated.
What I do, and what normally serves me well, is I have the following lines in my Network-Startup after the AddNetInterface:
---8<---
Wait 5 SECS
GetNetStatus CHECK INTERFACES,RESOLVER,DEFAULTROUTE
If WARN
  Wait 3 ; Increase chances of IPrefs having opened the final WB screen
  RequestChoice "Network Startup" "Network not operational,*nS:Network-Startup aborted" "OK"
  Quit
EndIf
---8<---
It's still not perfect; I remember Olaf saying something about GetNetStatus not always being able to determine for sure if things were up, but it's at least more concise and actually mostly works.
Maybe a matter of taste (and network conditions).
BTW, when you write about giving control back to the system, don't forget that Network-Startup is already being executed asynchronously, so in effect you're just exiting it, the system already has control and proceeds to boot WB while we start the network (which is why starting network-demanding programs from WBStartup is not such a great idea, as mentioned above).
Oh, no, wait a second. The script I posted is part of any script that needs network access; it's not something that is used on its own.
"Giving control back to the system" means that I exit the script that fails to access the internet if no network can be found or no internet access is possible (after the three tries, for whatever reason), so the script in question can't stall the boot any further.
The script is not running after that, of course, but since I also keep a boot.log (which every script writes to during boot), I can see what went wrong once WB is up and fix the cause (instead of rebooting many times until I make it to the WB and then still not knowing which script bombed).
See below for my log (sorry, it's a real log, so it's hard to read; download it and display it in MultiView, I guess).
I thought it was something you ran from Network-Startup to check whether the network was up before continuing.
Yes, that was my first attempt to fix the behaviour, but it didn't work. Actually, every script that accesses the web needs its own check for its own specific website to really rule out any possible hang.
Since it's easy enough to place it at the top of every script, it's worth it, and it works (for me at least).
This still happens from time to time and, as we know, it is easily solved by just doing a restart.
All I do in my startup-script is echo things to a file on the hard drive if, and only if, that file does not exist. But the last few times this happened in the past month, the file was new after I restarted, which means it had not been created before, which means the stall happens before the startup-script runs.
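A guard like the one described could be sketched as follows. The marker file name and the message text are invented here, since the post doesn't give them.

```amigados
; Write a marker once; skip if it already exists
If NOT EXISTS SYS:boot-marker.txt                 ; hypothetical file name
  Echo >SYS:boot-marker.txt "startup-script reached"
EndIf
```

If the marker is missing after a stalled boot, the stall happened before this point in the script was reached.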