Sequencer core
==============

Draft: draft 0.02, 6 May 1998

Authors: Frank van de Pol
         P.J.Leonard
         Jaroslav Kysela


Current Situation
-----------------

The currently available sequencer interfaces for Linux are /dev/sequencer
and /dev/music from the OSS. Though these interfaces are sufficiently useful
for most sequencer applications, they have a few shortcomings:

- Only one application at a time can have access.

- Because of the non-realtime character of a time-shared system like Linux,
  the driver offers a queue in the kernel, which is needed to prevent events
  from being scheduled too late. This queue introduces a big latency in
  event processing.

- It's one big monolithic driver.

Especially the 2nd issue restricts building MIDI oriented applications that
can perform on par with applications on Apple Macintoshes and Atari STs
regarding real-time response. Examples:

1) If one wants to have a sequencer perform a MIDI THRU function, it will
   suffer from big delays, and using the out-of-band ioctl()
   (SNDCTL_SEQ_OUTOFBAND) results in hanging notes because the driver has no
   clue what events are playing.

2) Events have to be enqueued ahead. Parameter changes (eg. volume control,
   muting tracks or instruments) will not be effective instantly.


New Sequencer
-------------

To overcome these disadvantages I'd like to propose a new architecture for
scheduling and dispatching MIDI and MIDI oriented events within the Linux
sound driver. Note that this is still 'paperware', and has yet to be
developed.

Some of the ideas I'll present are inspired by the MidiShare "MIDI operating
system" (http://www.grame.fr/english/MidiShare.html), which exists for Apple
Macs.

This new sequencer is intended as a replacement for /dev/music.

Highlights:

* Support for multiple independent event queues
* Supports multiple concurrent clients, both in userland and in kernel space
* kernel space clients can be loaded as modules
* sequencer takes care of dispatching of events; any client can send
  messages to any other client (or clients)
* superfast event routing in case of kernel module clients
* it's only a sequencer framework; all I/O (eg. MIDI, synth) has to be
  performed by clients. There's nothing but clients
* processing of high-level MIDI oriented messages: note on at timestamp xxx,
  control change at timestamp xxx

ASCII graphic of architecture (one queue):

    =========================
    ||                     ||
    ||                    \||/               \
    ||  +--------+      +--\/------------+    |
    ||  |        |      |                |    |
    ||  | Timing | -->- | Priority Queue |    |
    ||  |        |      |                |    |
    ||  +--------+      +--------||------+    |
    ||                           ||            \  Sequencer
    ||                          \||/           /  Core
    ||  +--------+  +------------\/--------+  |
    ||  | Client |  |                      |  |
    /\  | Manager|  |     Event Router     |  |
   /||\ |        |  |                      |  |
    ||  +--------+  +---||-----------||----+  |
    ||                 \||/         \||/     /
    ||               +--\/--+     +--\/--+
    ||               |Client|     |Client|
    ||               |  1   | ... |  n   |
    ||               +--||--+     +--||--+
    ||                  ||           ||
    ||                 \||/         \||/
    ||                  \/           \/
    ===================================

All events flow through the queue. Because a priority queue is used instead
of a simple FIFO, it's no problem to accept events from multiple clients.
(As long as the queue is not full, anyway; a well behaved application should
send only as many events as are needed to achieve tight playback (eg. 1
second), and not try to keep the queue overflowing.)

Clients can either be user-land applications accessing /dev/seq, or kernel
modules that can directly submit events and are directly called when an
event is dispatched to them (even from interrupt mode).

Timing should be done with time separators (between event blocks), which
should contain the time between ticks in us - relative or absolute
timestamp. This will allow an application to do some special things, like
effects inside a MIDI tick, etc.
The sequencer shouldn't know the current tempo and timebase, but it should
support some other add-on synchronization methods (SMPTE, FSK, etc.).


User-land clients
-----------------

The applications that are seen as user-land clients just open a /dev/seq
device (or something similar), and read()/write() data from/to it, just like
they used to do with the /dev/music device. For registering with and
interrogating the sequencer an ioctl() interface is used.

An interface for multiple clients should be provided by multiple opens of
the /dev/sndseq file. The open syscall opens only a sequencer which isn't
connected to any event queue / timer / kernel clients. Ioctl calls will
provide these connections. Resources (event queue, clients) should be
identified with some key (probably a string) which allows an application to
specify the exact resource it wants to create / connect to.

Security: The sequencer should match the UID/GID of the current user against
the requested resource. This prevents, for example, a normal user from
corrupting an event queue opened by the root user.

The presence of multiple user-land applications allows for some nice
features:

- Bank managers and synth editors can be run in parallel with sequencers.

- The recording source for a sequencer does not need to be a MIDI input
  port, but can also be the output of some other client, like for instance a
  bank manager. All sysex to set up a MIDI device can thus be recorded in
  the sequencer.

- A single application does not need to provide all the functionality one
  can think of. Why put a GM/XG mixer application within Cubase if one can
  use an external application, and let these interact? If one needs a 'meter
  bridge' that shows the levels for all the MIDI channels in the system, but
  the sequencer doesn't provide such a thing, a second 'meter bridge only'
  client can be used! Similar for mixers and bank managers (download samples
  to a wavetable synth (GUS!) on reception of program changes).


Kernel mode clients
-------------------

The kernel mode clients reside in kernel modules.
These can be stand-alone loadable kernel modules, or modules that offer
other functionality (eg. MIDI driver, soundcard driver).

The module level interface consists of the following:

- The client can register itself and provide information about itself and
  its capabilities to the system by calling a function exported by the
  sequencer.

- The client registers a call-back function that will be called when an
  event is to be dispatched to the client. This can be called from an
  interrupt handler. All the call-back function addresses are stored in some
  sort of a jump table.

- The client informs the sequencer which broadcast messages it's
  interested in.

- The client can call an (exported) function within the sequencer to enqueue
  a message.

- Unregistering is also done by calling an unregister function.

Because the events can be dispatched to a kernel mode client immediately,
this offers possibilities to provide good MIDI thru and filtering functions.

Example: The MIDI input driver receives MIDI bytes (interrupt driven); once
a complete MIDI message is received, it is sent to the sequencer. This event
(which has the current timestamp) goes directly to the clients that
requested reception of note events. If there is a kernel mode MIDI thru
module, it is directly called, the MIDI event is perhaps transformed
(swapped channel/port), and sent back to the sequencer, where it's directly
dispatched to the MIDI (or synth) client, which plays the event. Nice
low-latency MIDI patch-bay!

Because MIDI runs at 31.25 kbps, it will take 0.96 ms to receive a note on
message (without running status). Sending this event will also take 0.96 ms,
so the event arrives 1.92 ms later at the playback device, which is faster
than the typical delay within synthesizers before a sound is produced.

Other applications (apart from drivers) could be support modules for
high-end sequencers that need fine-grained real-time control.
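
The timing figures above can be checked with a little arithmetic. The sketch
below assumes the usual MIDI serial framing of 10 bits per byte (1 start, 8
data, 1 stop bit); the function name midi_wire_time_us() is made up for this
example and is not part of any proposed interface.

```c
#define MIDI_BAUD          31250u  /* bits per second on the MIDI wire */
#define MIDI_BITS_PER_BYTE 10u     /* assumed framing: 1 start + 8 data + 1 stop */

/* Microseconds needed to transfer nbytes MIDI bytes over the wire.
 * A 64-bit intermediate keeps the multiplication from overflowing. */
unsigned int midi_wire_time_us(unsigned int nbytes)
{
    unsigned long long bits = (unsigned long long)nbytes * MIDI_BITS_PER_BYTE;

    return (unsigned int)(bits * 1000000u / MIDI_BAUD);
}
```

For a 3-byte note on this gives 960 us, and twice that (1920 us) for
receiving plus resending. With running status a note on shrinks to 2 bytes,
i.e. 640 us per hop.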


Client communication
--------------------

The event passing mechanism is well suited for real-time controls, note
events etc. But to access very specific functions of a device (client), like
for instance downloading samples to a sample player, or changing the
microcode for a DSP or onboard processor, a different interface will have to
be provided. Either make this a special interface 'for the device or
application', or use the sequencer as a multiplexer to pass the data to the
driver.


MIDI ports
----------

Note that there is no such thing as a MIDI port, synth device or mixer in
this picture. The idea behind this is that a driver for a MIDI port should
be implemented as a 'kernel module'. This 'MIDI Port' module can of course
be part of a low-level MIDI port driver, and register itself from there.
There is no need to have both a MIDI lowlevel module and a MIDI sequencer
interface module.

To prevent the misery of stuck notes a low-level MIDI driver should keep an
image of which notes are active. In case a note_off message is missed (which
is an application error!) the driver can shut off the notes when asked.

The presence of a priority queue also offers the opportunity to process NOTE
events that come with a length. On reception of such an event (by eg. a MIDI
driver client), the note can be started and the corresponding note off can
be enqueued. Using this facility the chance of hanging notes because of
abrupt abortion of a MIDI player will be reduced to 0.


OSS Compatibility
-----------------

Because almost every MIDI application that exists for Unix makes use of the
OSS /dev/sequencer or /dev/music interface, it's a must to provide backwards
compatibility for these applications. Users can then use the new sequencer
engine while keeping their old applications. The application writers can
then migrate to the new sequencer and make use of its improved capabilities.

To achieve this compatibility a wrapper for the /dev/sequencer and
/dev/music devices has to be implemented (as a loadable module).
This wrapper can simply map the OSS events to the corresponding sequencer
calls.


Using OSS as the workhorse
--------------------------

At the time of writing this document, the only currently available MIDI and
synth driver is the OSS. If a client task is developed that presents itself
to the sequencer core as a bunch of input and output devices, and simply
does reads/writes to the OSS /dev/music interface (or perhaps directly calls
the exported functions), a sequencer can be developed and used while using
the OSS as workhorse / low-level driver.


Client Registration
-------------------

To let an application know what other clients have registered, a few
functions have to be provided to interrogate what clients are present, and
what capabilities they have.

eg.:

Client 1
    Name:         GUS MIDI Port
    Capabilities: MIDI input, MIDI output

Client 2
    Name:         GUS GF1
    Capabilities: MIDI output, SYNTH output

Client 3
    Name:         Timidity (Soft Synth)
    Capabilities: MIDI output, Disconnect

Client 4
    Name:         XG Editor
    Capabilities: MIDI input, Disconnect

Client 5
    Name:         Instrument Manager
    Capabilities: MIDI input, Disconnect


Event Structure
---------------

All the events have (apart from their specific content) a few common
fields:

- timestamp (in MIDI ticks), like OSS /dev/music
- message type/id (eg. NOTE_ON, CHANGE_TEMPO,...)
- source: the client the message comes from
- destination: the client(s) the message is to be sent to.

An event can be sent to either:

a) a specific client; in this case the client number has to be given.
b) all clients that have registered for this (class of) event; this is
   basically a broadcast.

This destination can also have (for eg. note events) a port and channel.

Events are modeled after MIDI, but are not restricted to be MIDI events.
Any event that one can think of can be fed into the sequencer and dispatched
at the specified time to the specified device(s).
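
The two addressing cases above could be decided per client roughly like
this. It is only a sketch: the type names, the event-class bits and the use
of 0 as the broadcast marker are assumptions made for the example, not a
settled layout.

```c
#include <stdbool.h>

#define DEST_BROADCAST 0u   /* assumed marker for "all registered clients" */

/* Hypothetical event classes a client can register for (bit mask). */
enum ev_class {
    EV_NOTE    = 1 << 0,
    EV_CONTROL = 1 << 1,
    EV_TEMPO   = 1 << 2
};

/* Hypothetical per-client record kept by the event router. */
struct seq_client {
    unsigned char id;          /* client number */
    unsigned int  subscribed;  /* OR-ed ev_class bits the client asked for */
};

/* The common fields every event carries, as listed above. */
struct seq_event {
    unsigned int  tick;        /* timestamp in MIDI ticks */
    enum ev_class type;        /* message type/id */
    unsigned char src;         /* sending client */
    unsigned char dest;        /* receiving client, or DEST_BROADCAST */
};

/* Should the router hand this event to client c? */
bool client_receives(const struct seq_client *c, const struct seq_event *ev)
{
    if (ev->dest != DEST_BROADCAST)
        return ev->dest == c->id;                      /* case a) */
    return (c->subscribed & (unsigned int)ev->type) != 0;  /* case b) */
}
```

A synth client registered for note and control events would thus get a
broadcast NOTE event, ignore a broadcast tempo change, and still receive a
tempo change addressed to it directly.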

Apart from this addressing scheme, a client can also request to get every
message, even ones that are meant for other clients. This promiscuous mode
allows a device to snoop all data.

Apart from the obvious events like MIDI note on/off, control change etc.,
some other events can be thought of:

- Announcement that a new client has registered
- Announcement that a client has unregistered (is gone)
- 'Change of capabilities' messages for a client (?)
- Other 'change of state' messages
  <(removed) - Change tempo, timer resolution>
- trigger /dev/dsp devices for instantaneous sample starting
- wake-up, for creating periodic tasks (when rescheduled) within the kernel
  drivers.


Difference between /dev/sequencer and /dev/music
------------------------------------------------

For some (historical?) reason OSS provides two different sequencer
interfaces, /dev/sequencer (the old one), and /dev/sequencer2, also known as
/dev/music. Is there a good reason why a new sequencer core should also
provide 2 interfaces? What exactly is the difference between these
interfaces?

For backwards compatibility I understand why these two should be
implemented, but is there any reason why the functionality cannot be
provided by one (good) sequencer? If the only real issue is that the old
interface gives lower-level access to a synth device, and this can't be
achieved by a simple interface wrapper (eg. access to every single voice in
the GUS for playing MOD files), it could be an idea to provide such synths
with a CAP_LOWLEVEL_SYNTH or CAP_LOWLEVEL_AWE32 capability flag. (And
perhaps a message to switch from one mode to another.)


Synchronisation
---------------

One of the points the currently available sequencer solution lacks is
synchronisation. Synchronisation can be used in a few places:

- Normal master clock is the system timer (typically 10ms).

- To get higher resolution, any timer within the system can be used. Most
  soundcards have a timer onboard that is capable of generating interrupts.
  This can be used as master clock for the sequencer.

- Audio playback can also act as a time source for synchronisation. Using a
  counter of the number of samples played to sync the master clock is a good
  starting point for getting MIDI in sync with digital audio.

- MIDI clock can also be used as a time source for the sequencer. Or one can
  decide to use MTC. These two protocols can be received and used to adjust
  the internal clock, or even simpler, can be transmitted.

- Some cards have a special synchronisation port (SMPTE code, FSK or
  something similar). This port can also play its role in the
  synchronisation game.


MIDI emulation
--------------

Software should know which standard is in use (GM, GS, MT-32, XG). The
emulation should be set/queried by the application, and the driver should
offer only some emulation for the internal synthesizer.

If there is a need to, we can even add a parameter to set the minimum and
maximum number of voices to be allocated for a specific channel (like the
Partial Reserve feature offered by Roland synthesizers). Set the max. to 1
to get a monophonic channel, set it to 0 to essentially mute the channel.

The Drum/Percussive channel can default to 10 (GM compatible), but should be
user configurable. The user should be able to set it to any channel he or
she wants, and even switch it off. Conforming to the GS/XG standards,
multiple (eg. 3) drum maps can be provided.

For changing the voice allocation and drum parameters we can simply use
NRPNs or sysex!!!

============================================================================

Event in detail:
----------------

Maximum number of event queues - 32.
Each event will have a fixed size of 32 bytes.
Maximum number of available clients in the system is 127 (1-127).
Variables word/dword are system specific (little or big endian).

#define SND_SEQ_EVENT_SYSTEM    0x0000  /* system event */
#define SND_SEQ_EVENT_TIMER     0x0001  /* timer event */
#define SND_SEQ_EVENT_SYNTH     0x0002  /* synthesizer event */
#define SND_SEQ_EVENT_MIDI      0x0003  /* MIDI v1.0 event */
#define SND_SEQ_EVENT_SYSEX     0x0004  /* MIDI v1.0 SysEx event */

struct snd_seq_event {
  /* --- event header --- */
  unsigned short type: 10,      /* event type */
                 queue: 5,      /* event queue identification */
                 realtime: 1;   /* event is realtime */
  unsigned short subtype;       /* event subtype */
  unsigned char src;            /* source client */
  unsigned char dest;           /* destination client */
  union {
    unsigned int tick;          /* 0 - 2^32 */
    struct {
      unsigned int sec;         /* seconds */
      unsigned short usec10;    /* microseconds / 10 */
    } t;
    /* maybe add MTC? */
  } timestamp;
  /* --- event header --- (12 bytes long) */
  union {
    struct snd_seq_event_system system;
    struct snd_seq_event_timer timer;
    struct snd_seq_event_synth synth;
    struct snd_seq_event_midi midi;
    struct snd_seq_event_sysex sysex;
    unsigned char data[20];
  } data;
};

snd_seq_event -> src
  - identifies the client which sends the event
  - zero is reserved for system purposes if the client can't be identified
  - should be used mainly for System events
  - values 128-255 are reserved for future use

snd_seq_event -> dest
  - identifies the client(s) which receive(s) the event
  - zero means broadcast event
  - values 128-255 should be used for multicast groups

#define SND_SEQ_EVENT_SYS_CONNECT       0x0000
#define SND_SEQ_EVENT_SYS_DISCONNECT    0x0001

struct snd_seq_event_system {
  union {
    unsigned short client;      /* client ID */
    unsigned char data[18];
  } data;
};

#define SND_SEQ_EVENT_TIMER_START       0x0000
#define SND_SEQ_EVENT_TIMER_STOP        0x0001
#define SND_SEQ_EVENT_TIMER_CONTINUE    0x0002
#define SND_SEQ_EVENT_TIMER_TIMEBASE    0x0003
#define SND_SEQ_EVENT_TIMER_TEMPO       0x0004
#define SND_SEQ_EVENT_TIMER_METRONOME   0x0005

struct snd_seq_event_timer {
  union {
    unsigned int timebase;
    unsigned int tempo;
    unsigned char data[18];
  } data;
};

/* TODO: SND_SEQ_EVENT_SYNTH_XXXX */

The synthesizer should
use static voice allocation or dynamic voice allocation (note that these
events don't have anything to do with MIDI).

struct snd_seq_event_synth {
  unsigned char voice;          /* voice # (0-255) */
  union {
    unsigned char data[19];
  } data;
};

struct snd_seq_event_midi {
  unsigned char len;
  unsigned char mdata[3];
  unsigned char reserved[16];
};

struct snd_seq_event_sysex {
  unsigned short offset;
  unsigned char len;
  unsigned char eox;            /* end of SysEx */
  unsigned char data[16];       /* SysEx data */
};

Note: The driver accepts/sends only complete sysex. If some sysex event is
      for some reason lost, the sysex isn't accepted.


Queue registering in detail:
----------------------------

The device /dev/sndseq should be opened with the O_RDWR (input/output),
O_RDONLY (input) or O_WRONLY (output) flag.

#define SND_SEQ_QUEUE_TIME_MODE_TICK    0
#define SND_SEQ_QUEUE_TIME_MODE_US      1

struct snd_seq_queue {
  unsigned char name[32];       /* application (client) name */
  unsigned char key[32];
  unsigned int queue: 5,        /* queue identification (returned by driver) */
               time_mode: 3;
  unsigned char reserved[16];
};

#define SND_SEQ_IOCTL_ATTACH    _IOWR( 'Q', 0x10, struct snd_seq_queue )
#define SND_SEQ_IOCTL_DETACH    _IOWR( 'Q', 0x11, struct snd_seq_queue )

snd_seq_queue -> key
  - contains a unique queue identification
  - maybe the key "promisc" will be reserved for promiscuous mode (the
    application should receive all events from all queues)
    - should be good for an instrument manager

Note: If the requested queue doesn't exist, a new queue is created.


Client registering in detail:
-----------------------------

When an application attaches to an event queue, all other application
clients receive a connect event.

#define SND_SEQ_CLIENT_INPUT        0x01
#define SND_SEQ_CLIENT_OUTPUT       0x02
#define SND_SEQ_CLIENT_DUPLEX       0x03
#define SND_SEQ_CLIENT_DISCONNECT   0x04

struct snd_seq_client {
  unsigned char client;         /* client number */
  unsigned char flags;          /* client flags */
  unsigned char name[32];       /* client name */
  unsigned char reserved[32];
};

#define SND_SEQ_IOCTL_CLIENT_INFO   _IOWR( 'Q', 0x20, struct snd_seq_client )
#define SND_SEQ_IOCTL_CLIENT_OPEN   _IOWR( 'Q', 0x21, struct snd_seq_client )
#define SND_SEQ_IOCTL_CLIENT_CLOSE  _IOWR( 'Q', 0x22, struct snd_seq_client )


Timer in detail:
----------------

struct snd_seq_timer {
  unsigned char timer;          /* timer number */
  unsigned char flags;          /* timer flags */
  unsigned char name[32];       /* timer name */
  unsigned int min, max;        /* min & max supported time resolution in us */
  unsigned int step;            /* time step in us */
};

#define SND_SEQ_IOCTL_TIMER_INFO    _IOWR( 'Q', 0x30, struct snd_seq_timer )
#define SND_SEQ_IOCTL_TIMER_OPEN    _IOWR( 'Q', 0x31, struct snd_seq_timer )
#define SND_SEQ_IOCTL_TIMER_CLOSE   _IOWR( 'Q', 0x32, struct snd_seq_timer )
#define SND_SEQ_IOCTL_TIMER_ATTACH  _IOWR( 'Q', 0x33, struct snd_seq_timer )
#define SND_SEQ_IOCTL_TIMER_DETACH  _IOWR( 'Q', 0x34, struct snd_seq_timer )
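
To tie the pieces together, here is a rough sketch of how a user-land client
might attach to a queue with the queue definitions given earlier. The helper
seq_attach_queue() and its error handling are hypothetical, struct
snd_seq_queue is given a tag so that the ioctl macro compiles, and since
none of this is implemented yet the ioctl behaviour is only what this draft
describes.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define SND_SEQ_QUEUE_TIME_MODE_TICK 0
#define SND_SEQ_QUEUE_TIME_MODE_US   1

/* Copied from the draft, with a struct tag added. */
struct snd_seq_queue {
    unsigned char name[32];     /* application (client) name */
    unsigned char key[32];      /* queue identification key */
    unsigned int queue: 5,      /* queue id (returned by driver) */
                 time_mode: 3;
    unsigned char reserved[16];
};

#define SND_SEQ_IOCTL_ATTACH _IOWR('Q', 0x10, struct snd_seq_queue)

/* Hypothetical helper: open the sequencer device and attach to the queue
 * identified by key (per the draft, the driver creates it if missing).
 * Returns the open file descriptor and stores the queue id in *queue_id,
 * or returns -1 if the device can't be opened or the attach fails. */
int seq_attach_queue(const char *dev, const char *name, const char *key,
                     unsigned int time_mode, unsigned int *queue_id)
{
    struct snd_seq_queue q;
    int fd = open(dev, O_RDWR);

    if (fd < 0)
        return -1;

    memset(&q, 0, sizeof(q));
    /* Leave the last byte zero so both strings stay terminated. */
    strncpy((char *)q.name, name, sizeof(q.name) - 1);
    strncpy((char *)q.key, key, sizeof(q.key) - 1);
    q.time_mode = time_mode & 7u;

    if (ioctl(fd, SND_SEQ_IOCTL_ATTACH, &q) < 0) {
        close(fd);
        return -1;
    }

    *queue_id = q.queue;
    return fd;
}
```

A client would call this once with something like
seq_attach_queue("/dev/sndseq", "My Sequencer", "song-1",
SND_SEQ_QUEUE_TIME_MODE_TICK, &id) and then read()/write() events on the
returned descriptor; the name and key strings here are just placeholders.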