@Sinan With removing -gai you link against those functions in the newlib => issues.
And no, it doesn't work with local games either (because of those threaded network issues; for details, check the latest 5-7 posts about it).
As for the Mod Security mod error, that can be disabled in the config.
But really, until we deal with those network issues, all other issues can wait: if we don't sort out the network threading code, none of it makes sense anyway :(
@sinisrus Everything is stalled at the moment by the network problems. I do not know whether they are endian related or (which is quite possible too) Roadshow related, but I fear that if it is Roadshow related, a fix is near to impossible.
I will definitely give it another go, as a lot of time was spent on it, but I don't know when. Not right now, as I have other projects to finish first which look more feasible.
I may also try to come at it from another side and find someone who can work on it on a paid basis (someone from the minetest team), but that is just an idea.
@All I want to resurrect this thing again after Andrea spent lots of time on clib2, in the hope of seeing whether it behaves any differently compared with newlib.
But then I ran into a strange problem, which could be in minetest itself or in the new clib2, dunno.
The issue is: once I run the game, things initialize, the window is created, etc., and when minetest starts to use the local built-in (and slightly modified) Lua library functions, it crashes.
It doesn't crash for real with a "GR" and all that register stuff; instead the game itself throws this:
ERROR [Main]: /minetst_clib2/src/script/cpp_api/s_base.cpp:52 ScriptApiBase:ScriptApiBase(ScriptingType): A Fatal error occured: luaL_newstate() failed.
And this s_base.cpp:52 is:
class ModNameStorer
{
private:
    lua_State *L;
public:
    ModNameStorer(lua_State *L_, const std::string &mod_name):
        L(L_)
    {
        // Store current mod name in registry
        lua_pushstring(L, mod_name.c_str());
        lua_rawseti(L, LUA_REGISTRYINDEX, CUSTOM_RIDX_CURRENT_MOD_NAME);
    }
    ~ModNameStorer()
    {
        // Clear current mod name from registry
        lua_pushnil(L);
        lua_rawseti(L, LUA_REGISTRYINDEX, CUSTOM_RIDX_CURRENT_MOD_NAME);
    }
};
Line 52 is that "lua_State *L;" in the private part.
With newlib, as I said, everything works fine. It could of course be an issue with the game or with the new clib2, but I need to find out why it crashes there at all and why it doesn't work.
I recall that I had this kind of issue somewhere once, but can't remember what it was. It seems like something brutal, like a library base that isn't initialized or so; though of course it should be, as this is not an Amiga library but a simple statically linked .a lib...
Any ideas and help appreciated!
Once it starts to work, we can check whether the clib2 build is bugged in the same way when creating a server, or not. If yes, then continue to debug; if not, then good for us.
Line number seems incorrect. If you look where the error message comes from, it should be in a different place. It should also be possible to debug the failing lua call, at least using printf.
Is this lua library code also compiled using the same C library?
@all OK, a plain build of the latest Lua and its two binaries, "lua" and "luac", also behaves weirdly with the new clib2, while it is fine with newlib. So either there is a bug in clib2, or it is a matter of different "ifdefs", so that some other Lua code gets compiled which isn't used for newlib.
Now I was able to build minetest over clib2 and, of course, it crashes the same way as with newlib when I try to create a game (so it tries to create a server), and the crash is of exactly the same kind: a heavy freeze of the OS, most of the time without a crashlog, or with some "crazy" crashlog.
Of course I first thought the network code was wrong, but fetching the list of servers works, they are listed fine, etc. So at least the network basically works. But we crash when we start a game and it is about to create/connect to the server.
I asked on the minetest forum whether the game is endian aware, and they said: "In general the code is written to be endian independent, but probably nobody has verified this to work in years." Not very promising, but at least they know what endianness is, and that they have no special little-endian parts.
So, does anybody have any idea what to do next? The functions really don't look that heavy and they should work; the only difference from usual usage is threading and mutexes.
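To make "endian independent" concrete, here is a minimal sketch of what such serialization usually looks like: values are written byte by byte with shifts, so the result is the same on a big-endian PPC and a little-endian x86. The names write_u32_be, read_u32_be and dump_hex are made up for this illustration and are not minetest's actual functions; the hex dump is only there so that a packet can be printed and compared between platforms.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical helper: store a 32-bit value in big-endian (network) order.
// Shifts operate on the value, not on memory, so host byte order is irrelevant.
inline void write_u32_be(std::uint8_t *out, std::uint32_t v)
{
    out[0] = static_cast<std::uint8_t>(v >> 24);
    out[1] = static_cast<std::uint8_t>(v >> 16);
    out[2] = static_cast<std::uint8_t>(v >> 8);
    out[3] = static_cast<std::uint8_t>(v);
}

// Hypothetical helper: rebuild the value from four big-endian bytes.
inline std::uint32_t read_u32_be(const std::uint8_t *in)
{
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}

// Print a buffer as hex so the same packet can be compared across machines.
void dump_hex(const char *tag, const std::uint8_t *buf, std::size_t len)
{
    std::printf("%s:", tag);
    for (std::size_t i = 0; i < len; i++)
        std::printf(" %02x", buf[i]);
    std::printf("\n");
}
```

Code written this way needs no special case per platform. What would break on big-endian is code that copies a raw int or struct into the buffer with memcpy or a pointer cast, so that is the pattern worth searching for.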
I am curious what is actually in the sent packet and what is transferred by Roadshow. Is this input properly serialised? But... I am sure you tried to check it.
Firstly, minetest has a set of "tests", run via "minetest --run-unittests". It tests threading, sockets, connections, serialization, etc.
So I ran it, and it simply freezes too after some tests have passed, and this freeze is random (it can happen early, right after the threading/socket checks, or a bit later). I then removed test after test to find the guilty one, and it is test_connection.cpp, there is:
@kas1e Are you able to printf the content (partially at least) of this putCommand? Is there anything in it at all? I am still afraid it is something around serialization or syntax. To gummy duck: you are able to get server info... but not able to put the server id... It means the server can't recognize the input... or the server has recognized it but your app didn't connect properly... You should be able to printf what the server gets.
Just to be sure, I added printfs/delays before and after this m_send_sleep_semaphore.post(); call, and yeah, I can confirm that this is the one where things freeze.
"m_send_sleep_semaphore" is "Semaphore m_send_sleep_semaphore;" (from connectionthreads.h).
And "Semaphore" is:
class Semaphore
{
public:
    Semaphore(int val = 0);
    ~Semaphore();
    DISABLE_CLASS_COPY(Semaphore);
    void post(unsigned int num = 1);
    void wait();
    bool wait(unsigned int time_ms);

void Semaphore::post(unsigned int num)
{
    assert(num > 0);
#ifdef _WIN32
    ReleaseSemaphore(semaphore, num, NULL);
#else
    for (unsigned i = 0; i < num; i++) {
        int ret = sem_post(&semaphore);
        assert(!ret);
        UNUSED(ret);
    }
#endif
}
Now... go figure what is wrong with the semaphores and why they freeze.
What I can see there is that it takes an "int" as input and then signals the semaphore that many times, but we call it from Trigger() without any value, just as "post()", so maybe something is wrong there...
Quote:
What I can see there is that it takes an "int" as input and then signals the semaphore that many times, but we call it from Trigger() without any value, just as "post()", so maybe something is wrong there...
num is given a default value of 1 in the declaration of Semaphore:
void post(unsigned int num = 1);
So if you call post() without a parameter it should use the default. You could add a printf to confirm that.
The problem is probably in the call to sem_post(&semaphore);.
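A minimal sketch of that printf check, using a hypothetical stand-in class (FakeSemaphore is not the real minetest class, it only mirrors the signature of post()). It shows that a call without arguments receives the default value from the declaration:

```cpp
#include <cstdio>

// Hypothetical stand-in, mirroring only the signature of Semaphore::post().
struct FakeSemaphore {
    unsigned int last_num = 0; // remembers what the last call received

    void post(unsigned int num = 1)
    {
        // The same kind of printf could be placed at the top of the real post().
        std::printf("post() called with num=%u\n", num);
        std::fflush(stdout); // flush, so the line is visible even if we freeze next
        last_num = num;
    }
};
```

Calling post() on such an object prints num=1 and post(4) prints num=4. If the real Semaphore::post() also prints num=1 when called from Trigger(), the default argument can be ruled out as the cause.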
But it calls .post only once, i.e. one time with num = 4.
But in our case, when we test network send/recv, we call .post once with num = 1 (yeah, the default), and it passes fine; then we call .post a second time, again with the default num = 1, and this time we freeze on sem_post(&semaphore).
I do not know why it calls .post a second time (and from where), but I can see that Semaphore::~Semaphore() is not called after the first call, that's for sure.
"sem_post" in our semaphore implementation is: int sem_post(sem_t *sem);
Maybe the issue is in calling .post several times rather than once? But then the only difference is one more call of a function that does almost nothing but sem_post...
Maybe it is worth writing a new unit test for minetest to experiment with different .post usage, so that we can see whether something is wrong with our semaphore implementation or whether it only happens when it is used with the network.
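Such an experiment does not even need minetest. Below is a minimal standalone sketch that posts to a POSIX semaphore several times in a row and then takes the posts back without blocking. The function name repeated_post_test is made up for this example, and it assumes that the semaphore library provides sem_init, sem_trywait and sem_destroy in addition to sem_post, which I have not verified for the clib2/newlib implementation.

```cpp
#include <semaphore.h>
#include <cstdio>

// Post `posts` times, then take every post back without blocking.
// Returns true if the semaphore count behaved as expected.
bool repeated_post_test(unsigned posts)
{
    sem_t sem;
    if (sem_init(&sem, 0, 0) != 0) {
        std::printf("sem_init failed\n");
        return false;
    }

    bool ok = true;
    for (unsigned i = 0; i < posts; i++) {
        std::printf("before sem_post #%u\n", i + 1);
        std::fflush(stdout); // flush so the last line survives a freeze
        if (sem_post(&sem) != 0)
            ok = false;
        std::printf("after sem_post #%u\n", i + 1);
        std::fflush(stdout);
    }

    // Each post must be matched by exactly one successful sem_trywait.
    for (unsigned i = 0; i < posts; i++)
        if (sem_trywait(&sem) != 0)
            ok = false;

    // The count is back to zero now, so one more sem_trywait must fail.
    if (sem_trywait(&sem) == 0)
        ok = false;

    sem_destroy(&sem);
    return ok;
}
```

Running it with 1, 2 and 4 posts covers both the single post(4) case and the "second post" case. If this passes while the network test still freezes, plain repeated posting is probably not the problem on its own.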
The code for sem_post() isn't terribly complicated. Since the same code seems to work most of the time (and presumably in other projects as well), I'm guessing that's not where the problem lies. So two possibilities come to mind.
1. The semaphore (sem_t) that's being passed to sem_post() is getting clobbered by something. sem_post() makes a number of Exec calls, passing pointers obtained from the semaphore. If one or more of those pointers is invalid, it could potentially cause a crash.
2. Presumably there's another thread somewhere that's sem_wait()ing on the semaphore. That thread wakes up when sem_post() posts to the semaphore, and that thread does something that causes the crash. The crash is not caused by the post, it's just triggered by it.
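The second possibility can be checked with tracing on both sides of the semaphore. This is a minimal sketch, not minetest code: traced_wait_post is a hypothetical name, and it assumes std::thread and sem_wait are usable on the target. The idea carries over to the real send thread: print right after the wait returns and right around the post.

```cpp
#include <semaphore.h>
#include <atomic>
#include <cerrno>
#include <cstdio>
#include <thread>

// One thread waits on the semaphore, the caller posts to it.
// Returns how many wake-ups the waiting thread saw.
int traced_wait_post(int posts)
{
    sem_t sem;
    sem_init(&sem, 0, 0);
    std::atomic<int> wakeups{0};

    std::thread waiter([&] {
        for (int i = 0; i < posts; i++) {
            // Retry if the wait was interrupted by a signal.
            while (sem_wait(&sem) != 0 && errno == EINTR) {
            }
            std::printf("[waiter] woke up, wake-up #%d\n", i + 1);
            std::fflush(stdout);
            // In the real code, whatever runs at this point is the suspect.
            wakeups++;
        }
    });

    for (int i = 0; i < posts; i++) {
        std::printf("[poster] before sem_post #%d\n", i + 1);
        std::fflush(stdout);
        sem_post(&sem);
        std::printf("[poster] after sem_post #%d\n", i + 1);
        std::fflush(stdout);
    }

    waiter.join();
    sem_destroy(&sem);
    return wakeups.load();
}
```

If the last line on the serial output is "[poster] before sem_post", the freeze is inside the post itself. If "[waiter] woke up" still appears and the freeze comes after it, the post only triggers the problem and the code the waiter runs next is where to look.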
I have never used POSIX semaphores anywhere; minetest is the first time I have met them. Frederik made this library precisely when I asked about it in this thread, and then fixed it a few times.
Very simple test cases pass, yeah, but the recv/send one causes a hardcore freeze, without a crashlog, without anything on serial, which means that something really heavy happens.
Quote:
So two possibilities come to mind.
The question is how to find it without a debugger: what to printf and where.
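For the first possibility (the sem_t getting clobbered), one printf-only approach is to put guard words around the semaphore and check them before every post. This is a minimal sketch under assumptions: GuardedSem, GUARD, guarded_init and guards_intact are names invented here, and in minetest it would mean temporarily patching the member inside the Semaphore class.

```cpp
#include <semaphore.h>
#include <cstdint>
#include <cstdio>

// Hypothetical wrapper: known guard words directly before and after the sem_t.
struct GuardedSem {
    std::uint32_t head;
    sem_t sem;
    std::uint32_t tail;
};

const std::uint32_t GUARD = 0xDEADBEEFu;

void guarded_init(GuardedSem *g)
{
    g->head = GUARD;
    sem_init(&g->sem, 0, 0);
    g->tail = GUARD;
}

// Call before and after each sem_post()/sem_wait(); `where` tags the call site.
bool guards_intact(const GuardedSem *g, const char *where)
{
    bool ok = (g->head == GUARD && g->tail == GUARD);
    std::printf("%s: guards %s\n", where, ok ? "intact" : "CLOBBERED");
    std::fflush(stdout); // flush so the line is not lost in a freeze
    return ok;
}
```

This only catches writes that run over the neighbouring memory, for example a buffer overrun from an adjacent member. A stray write that lands only inside the sem_t itself would go unnoticed. Printing the address of the semaphore at construction and at every post, to see that it is still the same live object, is a cheap addition.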