
Thread: Php Transpiler (per to x)

  1. #1
    Join Date
    Mar 2011
    Posts
    2,144
    Thanks
    59
    Thanked 116 Times in 113 Posts
    Blog Entries
    4

    Php Transpiler (per to x)

    I need to make a transpiler (want is probably a better word) to convert one language (one I create) to another language (a scripting language for a game (.per)).

    Here's a simple example of the actual code -
    Code:
    (defrule
    	(wood-amount > 100)
    	(unit-type-count-total militia-line < 10)
    	(can-train militia-line)
    =>
    	(train militia-line)
    )
    But in my language, it would look like this -
    Code:
    if(
    	wood.amount > 100;
    	militia-line.count() < 10;
    	can(train(militia-line));
    ){
    	train(militia-line);
    }
    How would you go about doing this?
    P.s. I know this would be better in a different language, but it's more of a proof of concept...

  2. #2
    Join Date
    Apr 2008
    Location
    So.Cal
    Posts
    3,643
    Thanks
    63
    Thanked 516 Times in 502 Posts
    Blog Entries
    5


    You'll have to break each language down into its component syntax and logical constructs, and map out directives/conditionals/variables/etc. in one language to their counterparts in the other. Depending on how similar the languages are and how directly they correlate, this might be difficult.*

    *this is my entry in a competition for "understatement of the year."

    What's your motivation, here?

  3. #3
    Join Date
    Mar 2006
    Location
    Illinois, USA
    Posts
    12,164
    Thanks
    265
    Thanked 690 Times in 678 Posts


    Do you need to just convert things in that format? You could use string functions and/or basic regex.
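    A minimal sketch of the string/regex route, written in Python for brevity (PHP's preg_* functions would follow the same shape). It only covers the four statement shapes shown in post #1; the RULES table and convert_statement name are made up for illustration, and the set of comparison operators is an assumption.

```python
import re

# Ordered list of (pattern, replacement) pairs; first match wins.
# Only the statement shapes from the first post are covered.
RULES = [
    # can(train(unit)) -> (can-train unit)
    (re.compile(r"^can\((\w+)\(([\w-]+)\)\)$"), r"(can-\1 \2)"),
    # unit.count() < n -> (unit-type-count-total unit < n)
    (re.compile(r"^([\w-]+)\.count\(\)\s*([<>]=?)\s*(\d+)$"),
     r"(unit-type-count-total \1 \2 \3)"),
    # resource.amount > n -> (resource-amount > n)
    (re.compile(r"^([\w-]+)\.(\w+)\s*([<>]=?)\s*(\d+)$"), r"(\1-\2 \3 \4)"),
    # action(unit) -> (action unit)
    (re.compile(r"^(\w+)\(([\w-]+)\)$"), r"(\1 \2)"),
]


def convert_statement(stmt):
    """Convert one statement of the custom language to a .per fact or action."""
    stmt = stmt.strip().rstrip(";").strip()
    for pattern, template in RULES:
        if pattern.match(stmt):
            return pattern.sub(template, stmt)
    # Refuse anything unrecognised instead of guessing.
    raise ValueError(f"no rule matches: {stmt!r}")


print(convert_statement("wood.amount > 100;"))  # (wood-amount > 100)
```

    Anything outside those four shapes raises an error, which is where this approach stops scaling.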

    Or do you need something that can actually handle the whole language? You'd need a full parser that also can translate. It's not easy.
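    The first stage of a full parser is usually a tokenizer. Here is a rough Python sketch for the custom language from post #1; the token categories and the tokenize name are assumptions, and treating the hyphen as part of a name means this grammar could not also support subtraction.

```python
import re

# One named group per token category; whitespace is matched but discarded.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+)
  | (?P<NAME>[A-Za-z][A-Za-z0-9-]*)
  | (?P<OP>[<>]=?)
  | (?P<PUNCT>[(){};.])
  | (?P<SKIP>\s+)
""", re.VERBOSE)


def tokenize(source):
    """Split source text into (category, text) pairs, failing on unknown input."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

    A parser would then consume this token list instead of raw text, so it never has to think about whitespace or character-level details.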

    Do you really want to take on this project at the moment?

    The real problem will come up when there is no linear-order similarity between the languages. If you're only using code for which you can relatively easily substitute chunks in the same sequence, that's just a complex version of a find and replace situation. But if you actually need to go into some sort of higher order logic to figure out which chunks should go where, that'll become much harder.



    I've never tried to do this for programming languages, but I've looked into it quite a bit for natural languages. In natural languages, this is more or less impossible because there are too many social/intuitive details that just can't be programmed into a computer, at least not yet. With programming languages you don't have that problem. But from what I've seen for natural languages even if you ignore that, it's incredibly complex. And there's no reason this would be simple, although it would be logically possible.
    Daniel - Freelance Web Design | <?php?> | <html>| español | Deutsch | italiano | português | català | un peu de français | some knowledge of several other languages: I can sometimes help translate here on DD | Linguistics Forum

  4. #4
    Join Date
    Mar 2011
    Posts
    2,144
    Thanks
    59
    Thanked 116 Times in 113 Posts
    Blog Entries
    4


    Thanks for the vote of confidence
    Yep, this is what I'm going to take on for the moment.
    The languages are very similar, just different formats... though there are some slightly more tricky bits as well...
    If you want, I can post bigger samples of both languages?

  5. #5
    Join Date
    Apr 2008
    Location
    So.Cal
    Posts
    3,643
    Thanks
    63
    Thanked 516 Times in 502 Posts
    Blog Entries
    5


    My vote of confidence was sincere - for most people, I wouldn't even consider making a serious response. At most, I'd say "just choose one or the other - to make a translator, you have to be very capable with both languages anyway."

    Just from looking at your first example, I can see a few spots where you might run into difficulty. For example, how will the parser know that wood-amount should be translated to wood.amount, while militia-line should remain militia-line? You're looking at an actual parser, not search-and-replace.
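    To show what "an actual parser" means in practice, here is a small recursive-descent sketch in Python for nested calls such as can(train(militia-line)). The grammar (a name, optionally followed by one parenthesised argument) and the parse_call name are assumptions made for illustration; the real language would need more cases.

```python
import re


def parse_call(text):
    """Parse NAME or NAME(expr) into a nested (name, [args]) tree."""
    # Characters outside names and parentheses are silently dropped here;
    # a real tokenizer should reject them instead.
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9-]*|[()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expr():
        nonlocal pos
        name = peek()
        if name is None or name in ("(", ")"):
            raise SyntaxError(f"expected a name at token {pos}")
        pos += 1
        if peek() != "(":
            return name          # a bare name such as militia-line
        pos += 1                 # consume "("
        args = []
        if peek() != ")":
            args.append(expr())  # recursion handles arbitrary nesting
        if peek() != ")":
            raise SyntaxError(f"expected ')' at token {pos}")
        pos += 1                 # consume ")"
        return (name, args)

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return tree


print(parse_call("can(train(militia-line))"))
# ('can', [('train', ['militia-line'])])
```

    Once the input is a tree rather than a string, the translator can inspect each node and decide how to emit it, which search-and-replace cannot do for nested input.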

    Are these object-oriented languages (like js)?
    Last edited by traq; 10-20-2012 at 02:10 AM.

  6. #6
    Join Date
    Mar 2006
    Location
    Illinois, USA
    Posts
    12,164
    Thanks
    265
    Thanked 690 Times in 678 Posts


    1) Create a complete list of correspondences.
    2) Create a translation system that switches them while preserving all details.
    Easier said than done, but go for it if you'd like

  7. #7
    Join Date
    May 2012
    Location
    Hitchhiking the Galaxy
    Posts
    1,013
    Thanks
    46
    Thanked 139 Times in 139 Posts
    Blog Entries
    1


    One of the main challenges I can see is getting all the syntax right. After you've actually transpiled the language, you could use regex to check whether the syntax is correct (first you'll need to figure out what the correct syntax is for particular lines), and if the syntax is incorrect, go through and debug it.
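    A rough Python sketch of such a post-check on the generated .per text. Plain regex cannot count arbitrarily nested parentheses, so this uses a depth counter instead; the check_per_rule name and the three checks are assumptions based only on the sample rule in post #1, and .per comments or strings are not handled.

```python
def check_per_rule(text):
    """Return a list of problems found in one generated (defrule ...) block."""
    problems = []
    depth = 0
    for offset, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                problems.append(f"unmatched ')' at offset {offset}")
                depth = 0  # keep scanning so later problems are reported too
    if depth > 0:
        problems.append(f"{depth} unclosed '('")
    if not text.lstrip().startswith("(defrule"):
        problems.append("rule does not start with (defrule")
    if text.count("=>") != 1:
        problems.append("expected exactly one '=>' separator")
    return problems
```

    An empty list only means the output has the right overall shape, not that it is a correct translation.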
    "Most good programmers do programming not because they expect to get paid or get adulation by the public, but because it is fun to program." - Linus Torvalds
    Anime Views Forums
    Bernie

  8. #8
    Join Date
    Mar 2006
    Location
    Illinois, USA
    Posts
    12,164
    Thanks
    265
    Thanked 690 Times in 678 Posts


    Something will be lost in translation if you are guessing at the end what might be wrong.
    But I agree in principle-- there's a huge problem here in that if you need it to work perfectly, the hardest thing is the most important thing-- having it work for most syntax is going to be very limiting and buggy.

    If you're serious about this, you'll need to think about the big picture quite a bit before doing it in detail. You might actually find some of the writing on NLP (Natural Language Processing) to be relevant because translation is talked about a lot. It's impossible (at least at the moment) for natural languages, but those same ideas work well for programming languages because the syntax is more controlled and less irregular.

    One option to consider is creating a metalanguage that will be easy to translate out of and into, rather than attempting to directly map the two to each other, and that might help you in the end.
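    One way to picture that metalanguage is as a small set of neutral data structures that a front end builds and a back end prints. The Python sketch below is only an illustration: the Fact and Rule classes and the rule_to_per function are hypothetical names, and only a .per emitter is shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Fact:
    """A condition or action, independent of either surface syntax."""
    name: str                      # e.g. "wood-amount", "can-train", "train"
    args: Tuple[str, ...] = ()     # e.g. ("militia-line",)
    op: Optional[str] = None       # comparison operator, if any
    value: Optional[int] = None    # right-hand side of the comparison


@dataclass
class Rule:
    conditions: List[Fact] = field(default_factory=list)
    actions: List[Fact] = field(default_factory=list)


def fact_to_per(fact):
    parts = [fact.name, *fact.args]
    if fact.op is not None:
        parts += [fact.op, str(fact.value)]
    return "(" + " ".join(parts) + ")"


def rule_to_per(rule):
    """Emit one rule in the .per layout shown in the first post."""
    lines = ["(defrule"]
    lines += ["\t" + fact_to_per(c) for c in rule.conditions]
    lines.append("=>")
    lines += ["\t" + fact_to_per(a) for a in rule.actions]
    lines.append(")")
    return "\n".join(lines)
```

    A second emitter for the custom syntax could walk the same Rule objects, so each language only ever needs a reader and a writer for the shared form rather than a direct mapping to the other language.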

  9. #9
    Join Date
    Mar 2011
    Posts
    2,144
    Thanks
    59
    Thanked 116 Times in 113 Posts
    Blog Entries
    4


    If it's not perfect, it won't work (whitespace isn't a problem though).

    @Traq: it was sarcasm; "this is my entry in a competition for 'understatement of the year'" isn't a vote of confidence.

    As for your question Daniel, the difference between militia-line and wood-amount is that militia-line is one object while amount is a sub-object of wood... I'm not sure how to make it detect that though...
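    One possible way to detect it is a symbol table: a list of every known object and property name, consulted before deciding what a hyphen means. Going from the custom language to .per, the dot already carries the information; the table matters in the other direction. The Python sketch below uses made-up table contents and a hypothetical per_name_to_x function.

```python
# Hypothetical tables; a real translator would list every .per identifier.
KNOWN_OBJECTS = {"militia-line", "wood", "food"}
KNOWN_PROPERTIES = {"amount"}


def per_name_to_x(name):
    """Decide whether a .per name is one object or an object plus property."""
    if name in KNOWN_OBJECTS:
        return name                      # militia-line stays as it is
    head, sep, tail = name.rpartition("-")
    if sep and head in KNOWN_OBJECTS and tail in KNOWN_PROPERTIES:
        return f"{head}.{tail}"          # wood-amount -> wood.amount
    # No formal reason to pick either reading, so refuse to guess.
    raise KeyError(f"unknown identifier: {name}")


print(per_name_to_x("wood-amount"))   # wood.amount
print(per_name_to_x("militia-line"))  # militia-line
```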

    Most of the actual data is said the same way across the languages. It's just the bit that does stuff to them is different (in most cases).
    It looks like line per line, I'll have to work out the purpose of that line, then pick the appropriate conversion method to convert it.

    The only thing I should mention is that there are some bits that are completely different.
    The language per doesn't have any functions() but mine does....

  10. #10
    Join Date
    Mar 2006
    Location
    Illinois, USA
    Posts
    12,164
    Thanks
    265
    Thanked 690 Times in 678 Posts


    Quote:
    @Traq: it was sarcasm; *this is my entry in a competition for "understatement of the year." isn't a vote of confidence
    I fully agree with traq here, in a completely literal sense. This is hard. And perhaps we should underline it and put it in bold: hard. This isn't hard for you personally-- it's hard for anyone to do it. It's logically possible. But it involves not only writing a full compiler but also then being able to make it work with another language. I just can't, honestly, see you actually doing this. I'm not sure how complex these languages are (perhaps they're very simple, and they'd need to be very simple, and maybe then it's possible), but this is right between two of my fields of expertise-- programming and languages, and this sounds like a huge project to me.

    Quote:
    If it's not perfect, it won't work (whitespace isn't a problem though).
    Then my sincere advice is to not fool yourself into thinking that solving the basics with search and replace is the start and that the rest will come from there. You'll have to start with the hard problems and work this out at a theoretical level including the logic behind why the languages are as they are.

    Quote:
    As for your question Daniel, the difference between militia-line and wood-amount is that militia-line is one object while amount is a sub-object of wood... I'm not sure how to make it detect that though...
    That's a great example (it was traq's originally) of why this is very difficult. It's the small things that matter. 95% correct isn't correct at all, so it's that hard 5% that you need to worry about, not the easy 95%.
    And this is an example of a fundamental issue in language processing: computers don't have common sense. There's no way for it to just "know" that one thing is one type and another thing is another type. You can guess, sure, but the computer can't. There must be a completely formalized (e.g., algorithmic) reason for it, or the computer will fail.


    There's a problem of infinity here that is a little hard to explain. These languages are recursive (right?) and they can in theory create new and complex phrases. Therefore your system must be able to handle lots of variation. That in itself is hard. But it's also the fact that, for example, symbols can be layered in several ways and you'll need to translate that. You have two tools available:
    1) Memorization (e.g., telling the computer that x=y)
    2) Rules (e.g., telling the computer that if there is an X, then if there is a Y, then to do.....)
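    The two tools can be sketched side by side in Python. The MEMORIZED table and translate_access function are hypothetical; the only pairs taken from the thread are count() to unit-type-count-total and wood.amount to wood-amount.

```python
# Memorization: pairs with no systematic relationship, so they are listed.
MEMORIZED = {"count": "unit-type-count-total"}


def translate_access(obj, member, is_call):
    """Translate obj.member or obj.member() into a .per fact head."""
    if is_call:
        # Looked up in the table; an unknown method raises KeyError
        # rather than producing a guess.
        return f"{MEMORIZED[member]} {obj}"
    # Rule: any plain property access joins object and property with a hyphen.
    return f"{obj}-{member}"


print(translate_access("militia-line", "count", True))
# unit-type-count-total militia-line
print(translate_access("wood", "amount", False))
# wood-amount
```

    The table handles the irregular cases and the rule handles everything that follows a pattern, so the table stays short.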

    The one major advantage of this over NLP is that you don't necessarily have to deal with thousands of words and lots of variation-- programming languages do have finite lists of parts, usually short lists (in a relative sense), and you could go through and list everything you need. To quote Pāṇini (Sanskrit grammarian a few millennia ago), 'there are no exceptions, just more specific rules'. This is totally possible, just a lot of work.


    Anyway, if you do decide to do this, then I wish you luck, but I can't be much more help than that. I know where I'd look for answers: in the theory first, waiting a while before actually coding anything. I don't know if you want to get that deep into it. But I can almost certainly guarantee that you will fail if you don't start at the deeper levels rather than just starting with the easy parts and hoping the rest starts to work out.

    It's sort of like building a house-- you can't start with just making the outside look pretty and sort of like a house looks-- you'll never end up at the point where you have the solid foundation with all of the crucial working parts in it.



    -----
    There is an alternative possibility: maybe this is easy. It's certainly possible that there could be two languages that are so similar that all it takes is a few search and replace operations and you'd be done. If that's the case, sorry for wasting the time getting sidetracked on other cases. But unless these languages are specifically similar in some way, then I expect that is not the case.
