From: Steven Lord on

"Amir Homayoun " <a.h.javadi(a)gmail.com> wrote in message
news:hihr9i$iv2$1(a)fred.mathworks.com...
> Dear Titus
>
> Thanks. I tried both "sched = findResource('scheduler', 'type',
> 'jobmanager');" and "sched = findResource('scheduler', 'configuration',
> 'jobmanager');" and I received the following error message,
> Warning: Could not contact any job manager lookup process. You may not
> have started a job manager, or multicast protocols may be failing on your
> network. If you are
> certain that a job manager is running, try findResource with a 'lookupURL'
> input.
>> In findResource>iCreateAccessor at 305
> In findResource>iFindJobManagers at 160
> In findResource>iFindScheduler at 263
> In findResource at 139
>
> I copied the file "distcompUserConfig.m" to my working directory (current
> directory) and tried again but still nothing.
>
> I looked up in the forum and I saw that another person had the same
> problem in 2007.
> Before I continue let me tell you about my system setup. I have one Mac
> computer with 8 cores and I want to run 8 parallel jobs on the same
> machine. It is not a network connection.
> There, in 2007, you mentioned that the distributing computing engine must
> be started using "mdce" command. I navigated to ...toolbox/distcomp/bin
> and entered the following command from MATLAB command window
>
> !mdce install
>
> and it returned me the following error message
>
> /bin/bash: mdce: command not found

Try:

!./mdce install
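
The reason the bare command fails is that the shell looks up a plain command name only in the directories listed in PATH, and the current directory is normally not one of them, while a name containing a slash such as ./mdce is run directly from the given location. Here is a minimal sketch of that behaviour. It does not touch the real mdce script: "mdce_demo" is a hypothetical stand-in created in a temporary directory, and the PATH value is an assumed typical one.

```shell
# Sketch only: "mdce_demo" is a hypothetical stand-in for the real mdce script.
bare_vs_dot_slash() {
    demo_dir=$(mktemp -d) || return 1
    printf '#!/bin/sh\necho "demo script ran"\n' > "$demo_dir/mdce_demo"
    chmod +x "$demo_dir/mdce_demo"
    (
        cd "$demo_dir" || exit 1
        # Assumption: a typical PATH that does not contain the current directory.
        PATH=/usr/bin:/bin
        # Bare name: only the directories in PATH are searched.
        if command -v mdce_demo >/dev/null 2>&1; then
            echo "bare name: found on PATH"
        else
            echo "bare name: command not found"
        fi
        # Name with a slash: no PATH search, the file is run from where it is.
        echo "with ./ prefix: $(./mdce_demo)"
    )
    rm -rf "$demo_dir"
}

bare_vs_dot_slash
```

The same applies to the other scripts in that directory (startjobmanager, startworker, nodestatus) when they are called from the MATLAB prompt with "!".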

*snip*

--
Steve Lord
slord(a)mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ


From: Titus Edelhofer on
Hi,
Yes, I hadn't thought of that one as a source of the error.
By the way, on a Mac you probably only need "!./mdce start", not "!./mdce
install". The install step is only needed on Windows ...

Titus


From: Amir Homayoun on
Hi

Thanks for your reply. I finally managed to start the engine. I had to edit the file "mdce_def.sh" so that it refers to valid directory names. Here are the messages I received while starting the engine and afterwards. I started one job manager and one worker.

>> !./startjobmanager
< no message >

>> !./startworker
Warning: Using multicast to locate the job manager because neither the command
line arguments nor the mdce_def file specifies the job manager hostname or
how to locate the job manager. This warning will not appear if you use
either the -jobmanagerhost or -multicast flag.
For more information, run the startworker command with the -help flag.

>> !./nodestatus -infolevel 3
Job manager lookup process:
  Status                        Running
Job manager:
  Name                          default_jobmanager
  Running on host               Amir-H-Javadi.local < it is my name >
  Number of workers             1
  Worker names and host names   Amir-H-Javadi.local_worker, Amir-H-Javadi.local
  Start time                    Wed Jan 13 08:20:10 GMT 2010
  Port                          27355
  Requested job manager lookup
  processes                     Amir-H-Javadi.local:27350 < it is my name >
  Registered with job manager
  lookup processes on hosts     Amir-H-Javadi.local:27350 < it is my name >
  Database size in bytes        227515
  VM heap size in bytes         5353472
  Worker lease timeout in msec  60000
  Network addresses of host     127.0.0.1
                                fe80:0:0:0:0:0:0:1%1
                                0:0:0:0:0:0:0:1
                                fe80:0:0:0:21f:5bff:fe3b:cce8%4
                                128.40.254.221 < it is the static IP address of my machine >
Worker:
  Name                          Amir-H-Javadi.local_worker
  Running on host               Amir-H-Javadi.local
  Status                        Idle
  Job manager                   default_jobmanager
  Connection with job manager   Connected
  Job manager hostname          Amir-H-Javadi.local
  Start time                    Wed Jan 13 08:21:28 GMT 2010
  Port                          27356
  Requested job manager lookup
  processes                     Using multicast
  Registered with job manager
  lookup processes on hosts     Amir-H-Javadi.local:27350
  File dependencies directory   /applications/matlab74/toolbox/distcomp/user/lib/Amir-H-Javadi.local_Amir-H-Javadi.local_worker_mlworker_log/matlabDependencyDir
  Worker startup directory      /applications/matlab74/toolbox/distcomp/user/lib/Amir-H-Javadi.local_Amir-H-Javadi.local_worker_mlworker_log/work
  Network addresses of host     127.0.0.1
                                fe80:0:0:0:0:0:0:1%1
                                0:0:0:0:0:0:0:1
                                fe80:0:0:0:21f:5bff:fe3b:cce8%4
                                128.40.254.221 < it is the static IP address of my machine >
Summary:
  The mdce service on Amir-H-Javadi.local manages the following processes:
  Job manager lookup processes  1
  Job managers                  1
  Workers                       1




When I try to find the resources, I get the following warning message in red:

>> jm = findResource('scheduler','configuration','jobmanager')
The job manager computer is unable to open a TCP connection back to this computer.
You will not be able to transfer data of size greater than 246723 bytes
between this computer and the job manager. Callback functions will also not work.
====================================================
Possible reasons for this problem are:
1. The job manager cannot resolve the short hostname of this computer.
2. This computer has multiple hostnames and the Distributed Computing Toolbox is using one that is unresolvable on the job manager.
3. A firewall is blocking communication between the job manager and this computer.
4. Network routers are unable to route traffic from the job manager to this computer.
Refer to the Troubleshooting section of the documentation for detailed debugging instructions.
The hostname used by the Distributed Computing Toolbox on this computer is: Amir-H-Javadi
The fully qualified hostname of this computer is unknown
The IP addresses of this computer are: 127.0.0.1, fe80:0:0:0:0:0:0:1%1, 0:0:0:0:0:0:0:1, fe80:0:0:0:21f:5bff:fe3b:cce8%4, 128.40.254.221 < it is the static IP address of my machine >

The job manager name is: default_jobmanager
The hostname of the job manager computer is: Amir-H-Javadi.local
which resolves to the fully qualified hostname: 128.40.254.221 < it is the static IP address of my machine >
The IP addresses of the job manager computer are: 127.0.0.1, fe80:0:0:0:0:0:0:1%1, 0:0:0:0:0:0:0:1, fe80:0:0:0:21f:5bff:fe3b:cce8%4, 128.40.254.221 < it is the static IP address of my machine >
====================================================
java.rmi.UnknownHostException: Unknown host: Amir-H-Javadi; nested exception is:
java.net.UnknownHostException: Amir-H-Javadi
The cause of this problem is:
====================================================
Amir-H-Javadi

This is causing:

Unknown host: Amir-H-Javadi; nested exception is:
java.net.UnknownHostException: Amir-H-Javadi
====================================================

jm =
Jobmanager Information
======================
Type : jobmanager
ClusterOsType : unix
DataLocation : database on default_jobmanager(a)Amir-H-Javadi.l...
- Assigned Jobs
Number Pending : 0
Number Queued : 0
Number Running : 0
Number Finished : 0
- Jobmanager Specific Properties
Name : default_jobmanager
Hostname : Amir-H-Javadi.local
HostAddress(s) : fe80:0:0:0:0:0:0:1%1
: fe80:0:0:0:21f:5bff:fe3b:cce8%4
: 128.40.254.221
State : running
NumberOfIdleWorkers : 1
NumberOfBusyWorkers : 0
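
The report above names two different hostnames: the one the toolbox uses for this computer (Amir-H-Javadi) and the one the job manager runs under (Amir-H-Javadi.local). As a rough sketch, assuming the warning text has been saved to a plain-text file (the file name is up to you, and the helper function names below are made up for illustration), the two names can be pulled out and compared like this:

```shell
# Sketch only: the function names are hypothetical helpers, not part of MATLAB.
# $1 is a text file containing the saved findResource warning.

# Hostname that the toolbox reports for the client computer.
client_host_from_report() {
    sed -n 's/^The hostname used by the Distributed Computing Toolbox on this computer is: *//p' "$1" | head -n 1
}

# Hostname that the report gives for the job manager computer.
jobmanager_host_from_report() {
    sed -n 's/^The hostname of the job manager computer is: *//p' "$1" | head -n 1
}

# Print whether the two names are identical.
compare_report_hosts() {
    client=$(client_host_from_report "$1")
    manager=$(jobmanager_host_from_report "$1")
    if [ "$client" = "$manager" ]; then
        echo "match: $client"
    else
        echo "mismatch: client=$client jobmanager=$manager"
    fi
}
```

Calling compare_report_hosts on the saved file prints a single line. A mismatch by itself is not proof of the cause, but it fits the UnknownHostException for the short name shown in the report.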




OK. I can also create simple and parallel jobs, for example:

>> j = createParallelJob(jm)
j =
Parallel Job ID 5 Information
=============================
UserName : ajavadi
State : pending
SubmitTime :
StartTime :
Running Duration :
- Data Dependencies
FileDependencies : {}
PathDependencies : {}
- Associated Task(s)
Number Pending : 0
Number Running : 0
Number Finished : 0
TaskID of errors :
- Jobmanager Dependent Properties
MaximumNumberOfWorkers : Inf
MinimumNumberOfWorkers : 1
Timeout : Inf
RestartWorker : false
QueuedFcn :
RunningFcn :
FinishedFcn :



But when I try to create a task, I get the following error message:

>> createTask(j, @Permutation, 1, {InputVar});
??? Error using ==> distcomp.job.pCreateTask at 92
The job manager could not contact this MATLAB session on hostname Amir-H-Javadi and port 27370.
Using the findResource command to find the job manager may provide a more detailed error message.

What should I do now? As I mentioned before, I want to run all the tasks on my local machine. Sorry that my message got so long.

Thanks again,

Have a good time
Amir


From: Titus Edelhofer on
Hi Amir,

Hmm, now it gets difficult ;-). It looks like a mismatch in host names:
Amir-H-Javadi and Amir-H-Javadi.local

There are only two suggestions left that I can give:
- Try starting the worker with the jobmanagerhost parameter set:
!./stopworker
!./startworker -jobmanagerhost 128.40.254.221
or
!./startworker -jobmanagerhost Amir-H-Javadi
- If this doesn't work, contact Technical Support at The MathWorks.
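
Before restarting the worker it may help to check which of the two names the machine can actually resolve. The following is only a sketch: the function names are made up, it relies on getent (Linux) or dscacheutil (macOS) being available, and it only tests name lookup, not whether the job manager can open a connection back.

```shell
# Sketch only: hypothetical helper functions for a quick hostname check.

# Returns 0 if the name resolves, 1 if it does not, 2 if no lookup tool exists.
host_resolves() {
    if command -v getent >/dev/null 2>&1; then
        # Linux and similar systems
        getent hosts "$1" >/dev/null 2>&1 || return 1
    elif command -v dscacheutil >/dev/null 2>&1; then
        # macOS: dscacheutil prints nothing when the lookup fails
        [ -n "$(dscacheutil -q host -a name "$1" 2>/dev/null)" ] || return 1
    else
        return 2
    fi
}

# Print one status line per hostname given on the command line.
check_hosts() {
    for name in "$@"; do
        if host_resolves "$name"; then
            echo "$name: resolves"
        else
            case $? in
                1) echo "$name: does not resolve" ;;
                *) echo "$name: unknown (no lookup tool found)" ;;
            esac
        fi
    done
}
```

For the machine in this thread the call would be: check_hosts Amir-H-Javadi Amir-H-Javadi.local. If only the .local name resolves, that would agree with the java.net.UnknownHostException for the short name.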

Titus

