How are UUID and SHA generated


#1

How are the uuid and sha generated?
I've been searching in the code, but unable to find it.

(I'm making a little shell script to set the correct values, which can easily be added to GEdit as a post-save command.)

-- EDIT--
Never mind, I think I found it in https://github.com/axoloti/axoloti/blob/master/src/main/java/axoloti/object/AxoObject.java (line 456)


#2

I'm getting different results when I try to re-create a uuid/sha.
Could be because I know very little about Java (I'm coding this in Python).

Could someone explain the steps to get the correct uuid for this simplified patch?

<objdefs>
   <obj.normal 
         id="1" 
         uuid="a189a2fcb9f59572da5d5fe2054f1b58ef53ed3b" 
         sha="4958207f126fe3f5840fa86aa68ce62787c2a0d2">      
      <sDescription>a</sDescription>
      <author>b</author>
      <license>c</license>
      <helpPatch/>

      <inlets>
         <int32 name="1" />
      </inlets>
      <outlets/>
      <displays/>
      <params/>
      <attribs/>
   </obj.normal>
</objdefs>

I know it only uses the value of id, the int32 inlet type, and the value of name, but in what way? This attempt (the Python I'm using, roughly) does not produce a matching hash:

import hashlib

id = "1"      # the object's id attribute
name = "1"    # the inlet's name attribute

m = hashlib.sha1()
m.update(id.encode())
m.update(b'int32')    # the inlet type
m.update(name.encode())
uuid = m.hexdigest()


#3

Hi,

I have no experience in Java and little in coding in general, but, from the looks of the function, it seems that it generates the UUID with a SHA function over the content of all the objects. Is that right?

As I infer from some other comments from @thetechnobear, although the UUID is generated with a deterministic function, the UUID value itself can be anything, and changing it is not harmful to the object (except, I imagine, for references to it).

Said this:

Wouldn't it be healthier, then, to standardize and reduce complexity by using a standard RFC-based UUID function?

@alex, maybe you can just generate a new UUID with uuid.uuid4()

I was thinking of building an Axoloti patch repository and was looking for a "proper" way of identifying/fingerprinting patches.
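For reference, a random RFC 4122 UUID in Python is one line (note it is random, so unlike the current content-hash scheme it cannot be re-derived from the object's definition):

```python
import uuid

# Generate a random RFC 4122 version-4 UUID instead of a content hash.
new_id = uuid.uuid4()
print(new_id.hex)  # 32 hex characters, different on every run
```
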


#4

this would be best discussed with @johannes, as there is a 'mid term' plan to build a repo here, and the uuid and sha are all part of this.

simply put, the uuid is supposed to uniquely identify an object... as its name could possibly change.
the sha is about describing the 'interface', so that we can determine it's compatible. (there is also an upgradeSha, I think)

together, the concept is that we can deal with 'automatically' upgrading objects between versions, yet also be able to keep historical versions too.

it's quite an interesting, and slightly complex, area

use case:
we find a bug in the sawtooth oscillator, so we fix it in a new version...

now what we want to happen is this:
a user upgrades,
and when he loads a patch, he is told the object has been upgraded and asked whether he would like the new version.
they say yes...

now whilst the new behaviour is 'correct', it has affected the sound*... such that his performance patches sound different with the new version... not great news if you have lots of patches and don't want to go through them all adjusting them.

so... we need to be able to run existing patches 'unchanged', unless they have been upgraded.

(*) history is littered with conceptually incorrect filters/oscillators, that people like :smile:

anyway, that's the basic idea (as far as I understood it), so it's a little more complex than just storing patches, which would tend to mean new versions break existing patches.


#5

So, besides fingerprinting with a UUID, a 'version' / 'commit' tag might be necessary? An immutable UUID during the life of an object might be necessary to describe or follow its development.

Should we open a new discussion about this with @johannes ?


#6

@johannes monitors these threads, and this is as good a thread as any.

the uuid is immutable.

commit tag... that's specific to how it's stored (e.g. a version control system); the sha is independent, we just need to make the tools more generic so they stamp the objects correctly.
currently all the 'factory' objects are generated from Java, rather than being hand-coded xml, hence they
have no issue regenerating the necessary sha.

iirc, the general idea is that on upgrade, objects that have changed would be moved to an 'archive' folder on the user's machine, with a file name combining uuid and sha (to allow different versions). so when a patch is opened, if the 'current' sha differs from the one associated with the patch (the patch records the uuid and sha of the objects it uses), it can search the archive for the original object.

in this way, a patch can either be upgraded to the new version of an object, or load the original version from the archive.
(bear in mind, any particular object may be upgraded many times, so 3 different patches could need to use 3 different versions)

it should be noted, this is as I remember discussing it with @johannes, so it may be incomplete, and it was also still subject to change/discussion for when we do the 'upgrade' process. but the important thing was to get the uuid/sha into patches before release, so that we have a mechanism for upgrades :smile:


#7

:smile: Yeah! I think the 'version' tag would be a nice thing to have, so anybody can identify the chronology of a certain plugin intuitively.

The sysadmin in me asks for a declaration of requirements, or a dependency list defined in the main patch, like the 'requirements.txt' file for pip on Python projects. Something like:

osc_sine = 1.3
osc_saw = latest
osc_supersaw = latest

Then, if your Axoloti distribution doesn't have those objects (or they don't match in version), Axoloti can download the exact version required from the central Axoloti repository, which is just a regular static http server (same as pip or apt).

Even better: the Axoloti community maybe isn't huge (as there are 'only' 600-something boards around), but some people might want to maintain a repository under their own control: maybe somebody wants to open a repository with his/her patches on a web page, or somebody builds a crazy community site where everybody can upload patches at will. Then, it could be something like:

[axoloti]
osc_sine = 1.3
osc_saw = latest
osc_supersaw = latest
[axo.andorfakewebpage.com]
megalead = 0.5
crazycoconut = 1.2
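Incidentally, the list above happens to be valid INI syntax, so (purely as an illustration of this hypothetical format) Python's standard configparser could read it directly:

```python
import configparser

# Illustrative only: parse the hypothetical per-repository dependency list
# sketched above into a {repo: {object: version}} mapping.
cfg = configparser.ConfigParser()
cfg.read_string("""\
[axoloti]
osc_sine = 1.3
osc_saw = latest
osc_supersaw = latest
[axo.andorfakewebpage.com]
megalead = 0.5
crazycoconut = 1.2
""")
deps = {repo: dict(cfg[repo]) for repo in cfg.sections()}
print(deps["axoloti"]["osc_sine"])  # → 1.3
```
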

Possible problem: If you edit the subpatch to modify something, modifications might be overwritten by the next update.

I'm just thinking out loud :smiley:


#8

ok, I need to correct my comments about the ids.

uuid - a sha of the interface (name, inlets, outlets etc.)
sha - a sha of the implementation
upgradeSha - all the previous shas of the object, i.e. how it has previously been known.

versioning, yeah I get the idea, but maintaining version numbers is error-prone (people forget, especially on small objects). also it's only the 'programmer's word' that says 1.2.2 is compatible with 1.2.1, and 1.3 has some differences.
and maintaining a package definition (pip file) is also a pain. i guess it could be hidden to some extent, i.e. a patch says it uses latest, and then you could freeze parts.
but again, this would need to be tied to some kind of unique naming, as potentially, if you're using sub-patches (from elsewhere in the user library), they could use different versions.

I guess the issue is, we as programmers are all too familiar with the complications of versioning and mismatches... and have lots of tools to manage this (some even work sometimes!), but frankly this is (imho) a small part of this project... where resources are pretty scarce, so it kind of 'worries' me when we start talking about complex schemes... but hey, perhaps I'm a pessimist :smile:


#9

Yep, probably a package definition file isn't even necessary; Axoloti can just fetch the object list from the XML and, if no version is given, fetch the latest.

But my mind is running around the idea of hosting alternative repositories. Maintaining a repository with external collaborations can be quite a PITA, and maybe only some of the objects are worthy (ahem :innocent:) of stealing valuable time from the main developers' hands :smiley:


#10

yeah, I think the point is the patch stores the uuid and sha... so it not only knows the interface but also uniquely identifies an implementation.
so I think the intended process is (@johannes will have to confirm, as it's been a while since we discussed this)...

when you start axoloti, you load all objects on your search path; each object not only has the uuid and current sha BUT also the past sha(s) in upgradeSha.

SO... when you load a patch, each object used is referenced through the sha... if it's the latest, it's immediately found, BUT if not, the upgradeShas are searched.
at this point there are two options:
i) load the newer version of the object.
ii) go find the original object... which will be identified via the upgradeSha.

(it can also check the uuids of the two objects have not changed i.e. the interface)

I think the original idea was simply that, when upgrading, we would copy all the existing objects to an 'archive' folder, with a filename of the sha.

but a remote implementation would be better... otherwise a new user may not be able to load a patch from another user, because it uses objects from the past.

you can see this kind of avoids numeric versioning, going instead for interface and implementation tagging. of course the 'downside', at least as I see it, is there is no way to say an object is backwards compatible:
e.g. adding a new 'optional' inlet will change the uuid in exactly the same way as removing an outlet... one would break a patch, the other would not.

I'll repeat, it's been a while since I last talked with Johannes about this, so I may have forgotten/confused a few details, and Johannes may have refined his ideas too.


#11

So the idea behind uuid/sha hashes was to implement a central online database (well this could be just a regular filesystem, and it could be versioned with git, but only adding files, never changing or removing) that contains all implementations of all objects ever created, so all dependencies can be resolved to the implementation ("sha") at the time of patch creation, and eventually can also resolve (by "uuid" hash) to bugfixed objects.

The "upgradeSha" field was probably not a good idea, it was a quick and dirty hack to upgrade existing patches when updating the "sha" hashes of all objects during development.

I think, rather than having a backwards reference in an object, it is better to have "upgrade" object files, that match sha/uuid and make a forward reference to the newer replacement object, and also contains a human-readable explanation of what has changed, and what impact to expect from upgrading.
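As a sketch of what such forward references might look like (illustrative Python with made-up field names and values, not an actual file format):

```python
# Each hypothetical 'upgrade' record maps an old (uuid, sha) pair to its
# replacement, plus a human-readable note on what changed and its impact.
upgrades = {
    ("old-uuid", "old-sha"): {
        "uuid": "new-uuid",
        "sha": "new-sha",
        "note": "fixed phase wrap; patches may sound slightly different",
    },
}

def follow_upgrades(uuid, sha):
    """Walk the forward-reference chain to the newest replacement,
    collecting the change notes along the way."""
    notes = []
    while (uuid, sha) in upgrades:
        rec = upgrades[(uuid, sha)]
        notes.append(rec["note"])
        uuid, sha = rec["uuid"], rec["sha"]
    return uuid, sha, notes
```
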

When creating a patch it is hard to tell whether it would benefit from an updated object, or whether you may want it to run exactly as when created. This is also not a general decision for all objects when loading a patch, but needs to be decided object-by-object.

It is a challenging quest to balance between backwards bug-by-bug compatibility and pushing forward improvements, without creating a dense forest of objects. I really appreciate all your input here, also contributions to an implementation would be very welcome.

For reference - there was some discussion about this topic here: https://github.com/axoloti/axoloti/issues/129


#12

Suddenly a flashback from Freenet "future links" came to me :smiley:

I think, within a 'namespace' (let's say, osc/sine), the implementation should be stable, at least across minor versions: backwards-compatible controls and in/outlets, and if additional ones are added, the object should work the same as before when those newer features have no input.

And if somebody wants to make significant functionality changes, he/she should fork it ( ~andor/osc/sine ).

Long thread! It's a bit late here, so I'll take a look tomorrow. I'll also check how apt and pip repositories manage version chronology and dependencies internally, and try to think of a simple implementation for version fetching. Simple is better!


#13

aren't we in some ways 'reinventing the wheel' here... surely if we want versioning, we should just use a version control system (even if it's hidden from the user)

(ok, this is flawed, but it's just an idea, to show what I mean)

what if all objects were stored in a local git repo, pulled from a central source? then a patch stores the object name and a commit id.

now when loading a patch, we can see if the current commit id has changed, and we can also pull an old object using the commit id stored in the patch.

ok, we have to hide git, so use JGit (or similar) to create options for the users, BUT there is huge scope over time... e.g. the object has changed, so we let the user see the versions with the check-in descriptions, who did it, when, etc... potentially even showing them the differences...

(note: if you have multiple repos, then you will need to store repo + commit id, as commit ids are not unique across repos)

also this allows for a bit of 'advanced stuff', including running from a forked repo, which we can then issue pull requests for.

(git is an example as it's what we are currently using, but any version control system does this kind of stuff, and many have open APIs now)


#14

I'm thinking about it because, if you pull the whole repo, you need to travel back and forth between commits to find the correct version of each object.

But instead of that, sparse checkout might come in handy. Sparse checkout is a partial checkout: just a portion of the repository tree. So you can do several sparse checkouts, one for each object in the project, and navigate back and forth separately on each object if needed.

This is funny because just 5 minutes ago I was thinking sparse checkout was a PITA and I would never use it.

Anyway, a git repository always carries one potential problem: the user edits one parameter on the original subpatch, and the repository won't allow updating unless you do a git stash or a git reset, and things might get funny.

Anyway, my option is either that, or packaging each commit of each object, statically, in a zip, on a plain webserver, so if you want a certain version of an object, you just:

wget http://axo.andorfictionalsite.com/objects/osc/sine/ah34561.zip

And done.

For packages I'm always a fan of static webpages, so I'm still a bit more inclined to the second option :wink:


#15

not convinced...
you won't be travelling back and forth via commits... as most of the time, users will be dealing with the latest versions, not particularly old ones.

sparse checkout, I'd add if/when needed... don't overcomplicate from the start... my experience is repos are not particularly large, and I've never had an issue; I can't see the axo objects being particularly large, as they are text so they compress well.

merge/stash, this needs to be considered (which is why I said my idea has flaws); essentially my idea was that users would NOT edit the repo directly, rather their changes would be submitted by the app into the repo, SO it could check for changes, do syncs beforehand etc. to prevent this.
(not as simple as that, but solvable I think)

I know the tendency is to think this is 'overkill', we only need something simple, but I've seen this so many times on projects... a team hacks together a simple solution, which over time gets more involved and complex... (I've seen it with files vs databases, version control and N other things) and it ends in a dog's breakfast and a maintenance nightmare.
and I can see it here already....

a file solution will not provide any history, or authoring, or dates, comments, or checkpoints.... how long before someone suggests that as well as the files we need metadata (so we can see comments, or the date of change, or the author), or wants to be able to access a file from the 3rd of January... how long before someone says it's too big, and we need to store only differences, or compress it?
I don't think any of these are unreasonable requests... and they are probably likely, once the library has been around for a while..... before long you have built a version control system.

it's the same with packaging and interdependencies: first people want zip files, then they want dependency management, package retrieval, archiving.

doing this means, before you know it, you're spending weeks of development time on it, which we just don't have... and there are many more interesting things to get on with.

I'm not saying we could not build a custom solution, we could, but frankly, as an open source developer I'm not interested in building a VCS/packaging system (if I were, I'd contribute to a VCS/packaging project)... I'm interested in building a music-related product, which is what I think Axoloti excels at... we all want a user library, but this is only 3% of the requirement, so I don't want to spend 80% of dev time on it.... a few days, not a few weeks/months, for me.

BUT this is only my opinion, if it interests you/johannes that's cool :smile:

one point on 'requirements': please remember many musicians will have the axoloti UI on a machine in a studio that is NOT connected to the internet (also, if they are 'away', they might not have an easy way to connect), so we should minimise the amount of online retrieval... which means having local caches.


#16

So...

What about this?

Include the objects git repo with the axoloti software distribution, so the user has object versions at least as new as the release they are downloading, and then, when opening a project, copy the needed object versions into the project's working directory: either the same ones the user was using, or newer ones if the user wants to upgrade.

I wouldn't overwrite whatever objects the user already has there; I would use a separate directory for git-pulled objects, something like this:

mysynth
├── mysynth.axp
├── coolsine.axp
├── supersaw.axp
└── dependencies
    ├── sine.axp
    ├── sine.axh
    ├── mux.axp
    ├── mux.axh
    └── versions.axd

But it would be awesome if the xml definition files had enough information for an external dependency system to work. I'm not thinking of writing a full-fledged dependency system, but of reusing some simple package manager that already exists and storing packages in plain directories :smiley:


#17

obviously we need @johannes's input on this, but I know he's very busy prepping for the Maker Faire in Rome (as he won it :tada: )

hmm, copying files is always a pain, as it means you end up with different versions in different projects; it becomes a mess really quickly. I think the way to handle this would be to do what Ableton does and have a 'collect' option, which basically copies all dependencies into the project.... and it's a one-off hit (afaik)

I'm not quite sure I understand why you want the stuff packaged / dependencies etc.
The only purpose I can see for this is distributing entire patches to other users...
this has 2 use cases as I see it:

  • demo/songs, and their sub-patches
  • components that required other components.

the former, we haven't seen much of so far.... it's possible to use the patch/patcher object to avoid having subpatches as separate files... ok, there are other dependencies, e.g. wav files, but really, do we have a strong use case for this?
the latter: mostly sub-patches have used factory objects, which are distributed (at the moment) with the release, so this is a dependency, but they don't have interdependencies with other sub-patches, although they possibly could; but again, we are some way off that now.

as for multiple repos... well, really the point is we want one central one; we don't really want people having to go to many sources... but I will grant that at some point moving the factory objects to their own repo might be good, as these may get updated more frequently than a software release.

I'd also get concerned that a packaging system is almost contradictory to a fine-grained object library.... or at the very least a completely different concern.

as it is today, I think the more pressing problem is sharing objects/sub-patches.... but again, that's just my opinion.

(there is an argument that some users will want complete solutions e.g. a reverb pedal , but for that one could argue that sharing the binary is a better idea.. then the only dependency you have is the firmware version on the axo board)


#18

I don't have Ableton around, I should check that, but I think we're now circling a similar idea.

About yours: if you have a project and 'collect' all objects, which, let's say, are version 1.1, and a month later you create a new project and collect all objects, which are now v1.2... don't you now have different versions all around your different projects?

About my (parallel) idea:

When I say "dependencies" I mean patch dependencies (my patch needs this, this and this object), not complex dependencies (my patch needs this object, and that object depends on these other 3 objects, and so on).

My idea is creating a way for people to share objects or groups of objects easily, without depending on having access to your repository. Maintaining 10 or 20 different committers on an object repository can be feasible, but, if people begin to share lots and lots of patches, you're going to spend more time giving repo access or approving patches than developing axoloti.

So, I'm thinking of an easy way of putting user-created objects or patch collections online.

If you are (or have been) an Ubuntu user, think of very simple "PPAs", but for objects or patches.

I'm with you on the last "complete solutions" thing. Unless we want to become a similar-in-a-way-but-not-the-same-at-all-spirit thing like Mod DUO.


#19

I think I've mentioned this before:
it would be easy to 'zip' multiple files into one and give it a different extension (like .axc, for 'collection'). It would only have to include the modules that are not factory objects. Opening such a file would simply unpack it in /tmp/folder/, and axoloti should also look for its modules there (temporarily).
This works for modules, however sub-patches could be a bit more complex.

(this is somewhat similar to the OpenDocument Format, which is basically a zip containing a few xml files, renamed to .odf)
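A rough sketch of the pack/unpack idea (Python; the function names and the .axc layout are assumptions, not an existing format):

```python
import os
import tempfile
import zipfile

def pack_collection(axc_path, module_paths):
    """Zip the non-factory modules a patch needs into one '.axc' file."""
    with zipfile.ZipFile(axc_path, "w", zipfile.ZIP_DEFLATED) as z:
        for p in module_paths:
            z.write(p, arcname=os.path.basename(p))

def open_collection(axc_path):
    """Unpack an '.axc' into a temporary folder; the editor would then add
    that folder to its module search path for the session."""
    tmp = tempfile.mkdtemp(prefix="axc_")
    with zipfile.ZipFile(axc_path) as z:
        z.extractall(tmp)
    return tmp
```
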


#20

Could we structure the pros/cons of the different approaches in a shared document or something?
It is important to choose the right direction.
I agree that developing an entire packaging/VCS system is beyond the scope of this project, but I am not convinced that the needs are very similar.

I think the key needs are de-duplication, and promoting migration to improved revisions so old bugs can die.